Since Delicious was terminating its service, I needed a new bookmark syncing service. I’ve switched to Firefox Sync, which finally seemed mature enough and, indeed, works like a charm. It synchronizes passwords and history perfectly between computers and Android devices.
Unfortunately, documentation and (working) examples are scarce. The storage server itself is rather simple and doesn’t need a lot of explanation, but the way the Sync code uses the storage definitely does. There is a “weave.py” implementation from Mozilla, but it’s rather outdated and won’t work with the current storage version (version 5).
Nonetheless, I managed to figure things out and get a working client that, at least, fetches bookmarks. This should be a good start for anyone trying to access Firefox Sync data.
The full code can be found on GitHub. I’ll highlight some specific details in this post.
When accessing the sync storage, you will need three things:
- Your username. This is usually an email address
- Your password.
- Your passphrase. This is what is used to encrypt all your data before storing it. It has the format x-xxxxx-xxxxx-xxxxx-xxxxx-xxxxx, where each x is a letter or digit (with some limitations). If you lose this, you lose your data.
Reading data then takes four steps:
- Get your (passphrase-encrypted) private key (../storage/keys)
- Decrypt your private key using your passphrase
- Fetch data from the storage backend (e.g. ../storage/bookmarks/xyzxyz123)
- Decrypt it using your private key
This also explains why Mozilla offers no web interface: they simply can’t read your data!
Request structure
The overall URL structure (when accessing sync data) is as follows:
    https://backend/<api>/<username>/<collection>
- backend is the base URL of a backend node
- api is the API version; 1.0 is currently used
- username is your username, encoded (see below)
- collection is the collection you’re accessing
Known collections include:
- form history
- open tabs
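As a rough illustration of the URL structure above, the parts can be joined like this. The function name `storage_url`, the example host, the example username and the `storage/` prefix in the collection path (borrowed from the ../storage/bookmarks example earlier in this post) are assumptions made for this sketch; they are not taken from the sample client.

```python
def storage_url(backend, username, collection, api='1.0'):
    """Join the URL parts as <backend>/<api>/<username>/<collection>."""
    # Strip stray slashes so the parts join with exactly one '/' between them
    parts = [backend.rstrip('/'), api, username, collection.strip('/')]
    return '/'.join(parts)

# Hypothetical node and (already encoded) username, for illustration only
example = storage_url('https://node.example.invalid/', 'abc123', 'storage/bookmarks')
print(example)  # https://node.example.invalid/1.0/abc123/storage/bookmarks
```

Stripping the slashes up front means it doesn’t matter whether the node URL comes back with or without a trailing slash.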
Decoding the passphrase
Firefox Sync generates a passphrase for you and creates a base32-encoded string out of it. For readability, it is split into 6 hyphen-separated parts, with ‘l’ replaced by ‘8’ and ‘o’ replaced by ‘9’. The following function transforms it back into its original form:
    import base64

    def decode_passphrase(p):
        def denormalize(k):
            """Transform x-xxxxx-xxxxx etc. into something b32-decodable."""
            tmp = k.replace('-', '').replace('8', 'l').replace('9', 'o').upper()
            # base32 input must be padded with '=' to a multiple of 8 characters
            padding = (8 - len(tmp) % 8) % 8
            return tmp + '=' * padding
        return base64.b32decode(denormalize(p))
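A round trip is a handy way to check the decoding. The sketch below repeats the decoder in compact form and adds a hypothetical `encode_passphrase` that applies the formatting described above. That the underlying key is 16 random bytes is my assumption, inferred from the 6-part format; the encoder is not part of the sample client.

```python
import base64
import os

def decode_passphrase(p):
    # Undo the hyphens and the 8/9 substitutions, then restore base32 padding
    tmp = p.replace('-', '').replace('8', 'l').replace('9', 'o').upper()
    padding = (8 - len(tmp) % 8) % 8
    return base64.b32decode(tmp + '=' * padding)

def encode_passphrase(raw):
    """Hypothetical inverse: raw key bytes -> x-xxxxx-xxxxx-... form."""
    b32 = base64.b32encode(raw).decode('ascii').rstrip('=').lower()
    # The base32 alphabet has no '8' or '9', so this substitution is reversible
    b32 = b32.replace('l', '8').replace('o', '9')
    # First group is a single character, the remaining groups have 5 each
    groups = [b32[:1]] + [b32[i:i + 5] for i in range(1, len(b32), 5)]
    return '-'.join(groups)

raw = os.urandom(16)             # assumed key size, see above
pretty = encode_passphrase(raw)
assert decode_passphrase(pretty) == raw
```

Because base32 only uses the letters A-Z and the digits 2-7, swapping ‘l’ and ‘o’ for ‘8’ and ‘9’ can never collide with a real character, which is why the substitution can be undone blindly.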
Encoding the username
The username needs specific encoding: take its SHA-1 hash and base32-encode the digest, in lowercase:
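    import base64
    import hashlib

    def encode_username(u):
        # hash the UTF-8 bytes so this also runs on Python 3; returns lowercase text
        digest = hashlib.sha1(u.encode('utf-8')).digest()
        return base64.b32encode(digest).decode('ascii').lower()
When doing API calls to the backend, you authenticate over HTTP using this encoded username and your password (not passphrase!).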
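To make the distinction between the three credentials concrete, here is a small sketch of how the authentication pair could be put together. The e-mail address and password are made up, and the function is repeated (in a Python 3 friendly form) so the example runs on its own.

```python
import base64
import hashlib

def encode_username(u):
    # SHA-1 over the UTF-8 bytes, then lowercase base32
    digest = hashlib.sha1(u.encode('utf-8')).digest()
    return base64.b32encode(digest).decode('ascii').lower()

email = 'someone@example.com'     # made-up address
password = 'not-the-passphrase'   # made-up account password

# (username, password) pair in the shape requests expects for auth=
auth = (encode_username(email), password)
print(len(auth[0]))  # 32
```

A SHA-1 digest is 20 bytes, which is exactly 32 base32 characters, so the encoded username never carries ‘=’ padding.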
Finding a node
Before fetching data, you’ll need to get the base URL of a backend node:
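    def get_node(self):
        # uses the requests library; the API version in the path is 1.0
        url = self.server + '/user/1.0/' + self.username + '/node/weave'
        r = requests.get(url, auth=(self.username, self._password))
        # a requests response has no read(); the body is in r.text
        return r.text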
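For experimenting without touching the network, the node lookup can also be written as a plain function with the HTTP call passed in. This is only a sketch: `node_lookup_url`, the `http_get` parameter and the stubbed response are names I made up, and the trailing slash and newline in the fake response are an assumption about what a server might return.

```python
def node_lookup_url(server, username, api='1.0'):
    """Build <server>/user/<api>/<username>/node/weave."""
    return '/'.join([server.rstrip('/'), 'user', api, username, 'node', 'weave'])

def get_node(server, username, password, http_get):
    """Look up the storage node.

    http_get(url, auth) must return the response body as text; with the
    requests library that could be: lambda url, auth: requests.get(url, auth=auth).text
    """
    body = http_get(node_lookup_url(server, username), (username, password))
    # Normalise whatever whitespace or trailing slash the server sends back
    return body.strip().rstrip('/')

def fake_http_get(url, auth):
    # Stand-in for a real HTTP call, so the example runs offline
    return 'https://node.example.invalid/\n'

node = get_node('https://auth.example.invalid', 'abc123', 'secret', fake_http_get)
print(node)  # https://node.example.invalid
```

Passing the transport in as a function keeps the URL logic testable and makes it easy to swap in a different HTTP library later.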
- weave.py, outdated, non-functional client code
- Weave client API
- Weave storage API 1.0
- Storage format, shows differences in version 5.
- Working sample python client
Hi, I had to change line 28 (in the script hosted on GitHub) because it didn’t work.
old: url = self.server + '/user/1/' + self.username + '/node/weave'
new: url = self.server + '/user/1.0/' + self.username + '/node/weave'
Comment by unodipassaggio — Sep 12, 2011 9:36:43 PM
hi, shouldn’t it be r.content instead of r.read()? At least it is working for me this way and I get an error otherwise. Thx for the code! Saved me a lot of time.
Comment by geier — Nov 28, 2011 11:59:30 PM