(2018-03-28) Kazemi Bracketmemebot
Darius Kazemi: bracket-meme-bot (Daemon)
ConceptNet, as you might expect, is more about general concepts like "gloom" and "razorfish" and "pencil" than it is about hyper-specific pop culture stuff. Whereas Wikipedia is replete with pop culture, and that's kind of key to this whole thing.
My first thought was to look at Wikipedia's famous "Lists".
But if we look at other lists, they're not formatted in a consistent way at all.
Fortunately, Wikipedia has another, better defined way of categorizing things: via the aptly-named Category.
But wait... Marie Curie herself is a Category!!
What we want is lists of things like: Disney films, soccer players, buildings in NYC, and cat breeds. What do all those things have in common? Well: they all have a plural noun in them.
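To make the "plural noun" test concrete, here is a rough sketch of how a category title could be screened. It is only an illustration, not the approach from the original post: the function names and the list of singular-looking endings are assumptions, and it checks spelling only, so words that merely end in "s" will slip through.

```python
import re

# Endings that usually mark a singular word despite a trailing "s".
# This list is an assumption for illustration and is far from complete.
SINGULAR_ENDINGS = ("ss", "us", "is")


def looks_plural(word: str) -> bool:
    """Crude spelling-based guess: a longer word ending in 's'."""
    w = word.lower()
    return len(w) > 3 and w.endswith("s") and not w.endswith(SINGULAR_ENDINGS)


def has_plural_noun(title: str) -> bool:
    """True if any word in the title looks plural by the rule above."""
    words = re.findall(r"[A-Za-z]+", title)
    return any(looks_plural(w) for w in words)
```

With this rule, "Disney films", "soccer players", "buildings in NYC", and "cat breeds" all pass, while "Marie Curie" does not. A real version would probably want a part-of-speech tagger instead of a suffix check.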
but it turns out that DBpedia doesn't have a way to just... grab a random category.
the other major option for getting data from Wikipedia, which is using the MediaWiki API
there is no built-in way to grab a "random" category,
it will tell us how many Pages are in a category but it won't let us filter by that
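Since the API reports the page count but won't filter on it, the filtering has to happen on our side after the response comes back. The sketch below assumes the MediaWiki `action=query` endpoint with `prop=categoryinfo` and the default JSON layout (pages keyed by page ID, each with a `categoryinfo.pages` count). The helper names, the size thresholds, and the sample response are all made up for illustration, and no network call is made here.

```python
from typing import Any, Dict, List

# English Wikipedia's API endpoint; pass the params below to it with any HTTP client.
API_URL = "https://en.wikipedia.org/w/api.php"


def categoryinfo_params(titles: List[str]) -> Dict[str, str]:
    """Query parameters asking for size info on the given category titles."""
    return {
        "action": "query",
        "prop": "categoryinfo",
        "titles": "|".join(titles),
        "format": "json",
    }


def filter_by_page_count(
    response: Dict[str, Any], min_pages: int, max_pages: int
) -> List[str]:
    """Keep only categories whose page count falls within the given range."""
    pages = response.get("query", {}).get("pages", {})
    keep = []
    for page in pages.values():
        info = page.get("categoryinfo")
        if info is None:  # missing or empty categories carry no categoryinfo
            continue
        if min_pages <= info.get("pages", 0) <= max_pages:
            keep.append(page["title"])
    return sorted(keep)


# Hand-written placeholder response with invented titles and counts.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "1": {"title": "Category:Example A", "categoryinfo": {"pages": 40}},
            "2": {"title": "Category:Example B", "categoryinfo": {"pages": 3}},
            "-1": {"title": "Category:Example C", "missing": ""},
        }
    }
}

print(filter_by_page_count(SAMPLE_RESPONSE, min_pages=16, max_pages=500))
```

The cost of this workaround is that every candidate category has to be fetched before it can be rejected, which is exactly the kind of mismatch between algorithm and tooling the next lines are about.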
So our proposed algorithm is incompatible with the software we have at hand. At this point we have two options:
- adjust the algorithm to fit the tech
- write new tech that does what we want
HERE IS WHERE A VAST MAJORITY OF ENGINEERS COMPLETELY SCREW UP!
The right decision here, and I mean "correct" or "just", is to simply change our brilliant initial design and move on. It will probably change the outcome of the project and what it looks like. This is okay.