(2012-10-06) Simple Note Synch With Python
Am writing some Python code to synch my Private Wiki notes with the SimpleNote cloud. See 2012-07-13-SimplenoteAndNotationalvelocityDumpingLocalNotes for background.
Part of me thinks I should be looking for generic Data Synch code to steal from.
Oct05 - do some initial code sketching. Then decide to start with the "special case" of the first-time scrape of what's already in the cloud. But run into issues with the API failing. See some other folks complaining on Twitter too, so decide to ignore it for now.
Try running the first-scrape code again; now it works fine and builds a local pickle dictionary of what's in the cloud. (Oct07)
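For flavor, here's a minimal sketch of what that first scrape boils down to, using the third-party simplenote Python library as a stand-in for whatever API wrapper gets used; the credentials and CACHE_PATH are placeholders, not my real ones (the actual code is in the GitHub repo linked below):

```python
import pickle

from simplenote import Simplenote  # third-party library: pip install simplenote

CACHE_PATH = 'cloud_cache.pickle'  # illustrative path, not the real one

sn = Simplenote('user@example.com', 'password')  # placeholder credentials

# get_note_list() returns (list_of_note_dicts, status); status 0 means success
notes, status = sn.get_note_list()
if status != 0:
    raise RuntimeError('SimpleNote API call failed')

# cache the notes locally, keyed by the cloud note key
cloud = dict((n['key'], n) for n in notes)

with open(CACHE_PATH, 'wb') as f:
    pickle.dump(cloud, f)
```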
Get a handle on any Time Zone issues for comparing versions.
- confirm (Oct07) that for a synched file/note, the cloud-note-modifydate and local-file-getmtime values match
- confirm that the time.time() value matches the local-file-getmtime value
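A sketch of that check, assuming the cached cloud metadata stores modifydate as epoch seconds (which is how the SimpleNote API sends it); note_meta and the file path are stand-ins:

```python
import os
import time

def times_match(note_meta, local_path, slop=2.0):
    """True if the cloud note's modifydate agrees with the local file's mtime.

    Both values are seconds-since-epoch, so no timezone math should be
    needed; slop allows for filesystem timestamp granularity.
    """
    cloud_mtime = float(note_meta['modifydate'])  # SimpleNote sends an epoch string
    local_mtime = os.path.getmtime(local_path)    # epoch float from the OS
    return abs(cloud_mtime - local_mtime) < slop

# sanity check: time.time() runs on the same clock as getmtime(),
# so this difference is just "seconds since the file was last written"
# print(time.time() - os.path.getmtime('some_note.txt'))
```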
Realize that I've stored all the metadata and even the content in my local cache. Seems excessive to have the content, and even most of the metadata fields seem pointless. But not going to worry for now.
Have successfully pushed a couple months of added/updated local files into the cloud.
- put my code into [GitHub](https://github.com/BillSeitz/simplenote_synch)
Realize that I have a lot of dupes in the cloud (and in my Tablet FlickNote)!
- I even see a few triples!
- It looks like all the non-uniques are new additions I pushed to the cloud; notes I pushed as updates stayed unique.
- It's hard to debug an issue like this because the web view doesn't tell me a note's key or anything else. Nor any sort of log...
- Ah, it's because some early bugs caused me to have to run my code multiple times, and I didn't have good enough error checking in there to deal with discrepancies between my local map and the cloud cache.
- Have 2131 notes in cloud, vs 2086 local.
- Start running de-dupe code. Keeps dying before it finishes. But I am saving my new map along the way, so maybe if I tweak my code I won't have to re-run those same notes...
- Looks like no deleting (de-duping) has happened so far. But my map has "1704 notes, 1664 keys", which makes it sound like it has 40 dupes in it! Or maybe something else is going on?
- So my immediate-next-step should be to review that map file and see if there are dupes, and figure out why. Then tweak my scraper code to not have to re-scrape the same stuff I got before the previous run died.
- definitely have dupes in my map - but why? Maybe a bug that's keeping me from running the de-dupe part? Update: yep, had the test backwards! (The de-dupe idea is sketched after this list.)
- deleted the 40 dupes from map and cloud. (Oct08)
- tweaked my de-dupe code to not re-check the notes I'd already scraped, just scrape and dedupe the rest. Now all maps and cloud and files are consistent. Update GitHub.
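The de-dupe idea, as a sketch rather than my exact code: group the cached cloud notes by name, keep the most recently modified copy of each, and trash the rest with the simplenote library's trash_note. The cache/credential names are the same placeholders as above:

```python
import pickle
from collections import defaultdict

from simplenote import Simplenote

sn = Simplenote('user@example.com', 'password')  # placeholder credentials

with open('cloud_cache.pickle', 'rb') as f:
    cloud = pickle.load(f)  # {cloud_key: note_dict}, built by the first scrape

# group cached notes by name (here: first line of the content)
by_name = defaultdict(list)
for note in cloud.values():
    name = note['content'].split('\n', 1)[0]
    by_name[name].append(note)

for name, copies in by_name.items():
    if len(copies) < 2:
        continue  # unique, leave it alone
    # keep the newest copy, trash the rest (in the cloud and in the cache)
    copies.sort(key=lambda n: float(n['modifydate']), reverse=True)
    for dupe in copies[1:]:
        sn.trash_note(dupe['key'])
        del cloud[dupe['key']]

with open('cloud_cache.pickle', 'wb') as f:
    pickle.dump(cloud, f)
```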
Grr, FlickNote doesn't seem to be purging the deleted notes. Update: ah, it just took a few synchs to get fully updated or something. All good now.
Process for now: only edit on Lap Top, not web or FlickNote.
Refine code to push changes from Lap Top to cloud. Done Oct10. Update GitHub.
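The push logic boils down to: any file whose mtime is newer than last_synch_finish gets pushed, as an update if it's already in the name-to-key map, otherwise as an add. A sketch, with NOTES_DIR and the map as stand-ins for my real structures:

```python
import os
import time

from simplenote import Simplenote

NOTES_DIR = '/path/to/wiki/notes'  # placeholder
sn = Simplenote('user@example.com', 'password')  # placeholder credentials

def push_changes(name_to_key, last_synch_finish):
    """Push any file modified since the last synch finished.

    name_to_key maps local file names to cloud note keys; files not in the
    map get added as new notes, and the cloud-assigned key is recorded.
    """
    for fname in os.listdir(NOTES_DIR):
        path = os.path.join(NOTES_DIR, fname)
        if os.path.getmtime(path) <= last_synch_finish:
            continue  # unchanged since the last synch
        with open(path) as f:
            content = f.read()
        if fname in name_to_key:
            # existing note: send key + content so the cloud updates in place
            note, status = sn.update_note({'key': name_to_key[fname],
                                           'content': content})
        else:
            # brand-new note: the cloud assigns the key; record it in the map
            note, status = sn.add_note(content)
            if status == 0:
                name_to_key[fname] = note['key']
    return time.time()  # record this as the new last_synch_finish
```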
Jan07'2013 update: notice my synch log says: "using cached pickle of raw notes; 2119 raw notes in cloud list; doing pickleread; last_synch_finish 1357574983.0; 2173 notes (according to map); 2173 keys (unique names, according to map)", while my local directory has 2093 files. So something ugly is going on (a consistency-check sketch follows this list).
- back on Nov17 the synch said: "using cached pickle of raw notes; 2119 raw notes in cloud list; doing pickleread; last_synch_finish 1353095805.0; 2130 notes (according to map); 2130 keys (unique names, according to map)"
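A consistency check that cross-tabulates the three counts would make this kind of drift obvious; a sketch against the same placeholder cache, map, and directory names as above:

```python
import os
import pickle

NOTES_DIR = '/path/to/wiki/notes'  # placeholder

with open('cloud_cache.pickle', 'rb') as f:
    cloud = pickle.load(f)  # raw cloud notes, keyed by cloud note key
with open('map.pickle', 'rb') as f:
    name_map = pickle.load(f)  # local file name -> cloud note key

local_names = set(os.listdir(NOTES_DIR))
map_names = set(name_map)
cloud_keys = set(cloud)

print(len(cloud_keys), 'in cloud;', len(map_names), 'in map;',
      len(local_names), 'local files')
print('in map but no local file:', sorted(map_names - local_names))
print('local file but not in map:', sorted(local_names - map_names))
print('map entries whose key is missing from the cloud cache:',
      sorted(n for n, k in name_map.items() if k not in cloud_keys))
```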
Future
- start worrying about synching other direction.