Fractal Dimension Of A Hypertext Space

The crux of my preference for Smashed Together Words over the Free Link approach to Automatic Linking in a Wiki is that I believe it increases the "bushiness" of the space.

In 1997 Leo Egghe (L Egghe) wrote the article "Fractal And Informetric Aspects Of Hypertext Systems".

In the paper he defines the Fractal dimension of a HyperText based on the number of (internal) links it includes:

• let `n` denote the total number of pages, and
• `m` the average number of "HLs" (hypertext links) per page
• I'm pretty sure that count should include only
  • explicit Wiki Name references, not automatic BackLinks
  • unique cases of a link on a given page (so if `Page A` has 3 separate links to `Page B` in it, which happens in a wiki when you use the same Wiki Name multiple times in a page, then it should only count as 1 for that page)
  • links to pages that actually already exist (vs the Wiki case of a link to create-a-page-with-that-name)
• the fractal dimension `D = ln(n) / [ln(n) + ln((1+m) / m)]`
• Python: `fractal_dimension = log(num_pages)/(log(num_pages) + log((1 + avg_frontlinks) / avg_frontlinks))`
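The formula above can be wrapped in a small function. This is just a minimal sketch: the function name and the input checks are my own additions, not something from Egghe's paper.

```python
from math import log


def fractal_dimension(num_pages: int, avg_frontlinks: float) -> float:
    """Return D = ln(n) / (ln(n) + ln((1 + m) / m)).

    num_pages is n, avg_frontlinks is m (average outgoing links per page).
    """
    # Assumption: reject inputs where the logarithms are undefined.
    if num_pages < 1:
        raise ValueError("num_pages must be at least 1")
    if avg_frontlinks <= 0:
        raise ValueError("avg_frontlinks must be positive")
    ln_n = log(num_pages)
    return ln_n / (ln_n + log((1 + avg_frontlinks) / avg_frontlinks))


# Example: 1000 pages averaging one link per page.
print(fractal_dimension(1000, 1.0))
```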

I should really work on writing some code to calculate this for various spaces... (I'm kinda surprised this hasn't been done for Wikipedia already)

Aug08'2014: realized my WikiGraph code got me 90% of the way... ended up with:

``````
number of pages:  16506
fractal dimension:  0.982089872391
``````

Let's compare to a fake WebLog of 1000 pages averaging 0.5 links per page (I've made a spreadsheet to calculate these):

``````
number of pages:  1000
fractal dimension: 0.862782681486289
``````
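The links-per-page figure behind a reported dimension can be recovered by inverting the formula: solving `D = ln(n) / [ln(n) + ln((1+m) / m)]` for `m` gives `m = 1 / (n^((1-D)/D) - 1)`. A hedged sketch (the function name is hypothetical):

```python
def implied_avg_frontlinks(num_pages: int, dimension: float) -> float:
    """Invert D = ln(n) / (ln(n) + ln((1 + m) / m)) to recover m."""
    # Assumption: only meaningful for n > 1 and 0 < D < 1.
    if num_pages <= 1:
        raise ValueError("num_pages must be greater than 1")
    if not 0 < dimension < 1:
        raise ValueError("dimension must be strictly between 0 and 1")
    # 1 + 1/m = n ** ((1 - D) / D)
    ratio = num_pages ** ((1.0 - dimension) / dimension)
    return 1.0 / (ratio - 1.0)


# The fake WebLog numbers above: 1000 pages, D = 0.862782681486289.
print(implied_avg_frontlinks(1000, 0.862782681486289))
```

For the fake WebLog this comes out to about 0.5 links per page.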

Let's check 2 variations on that fake WebLog:

• `avg_frontlinks = 1.0 -> fractal dimension: 0.9088072522638707`
• `avg_frontlinks = 0.1 -> fractal dimension: 0.7423183624341485`

And let's compare to a smaller version of my WikiLog (note that if I had fewer pages, then I'd have fewer links/page because many WikiWords wouldn't hit matches - but we'll ignore that for now):

• `n=1000, avg=5.16 -> fractal dimension: 0.975002217`

Sept18'2014: was going to write an HTML-scraper to handle my Private Wiki, but decided to just grab the raw-text, since that's much easier.

• `n=2687, avg = 3.74 -> fractal dimension: 0.970874964094`
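Counting links from raw text might look roughly like the sketch below. It is not my actual WikiGraph code. It assumes pages are available as a dict of page name to raw text, and that a Wiki Name is two or more capitalized runs smashed together. Both the regex and the function name are illustrative guesses. It applies the three counting rules from earlier: explicit references only, unique per page, and only links to pages that already exist.

```python
import re

# Assumed WikiWord pattern: at least two capitalized runs, e.g. "WikiGraph".
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")


def avg_frontlinks(pages: dict) -> float:
    """Average number of unique links per page that point at existing pages.

    pages maps page name -> raw page text.
    """
    if not pages:
        raise ValueError("need at least one page")
    existing = set(pages)
    total = 0
    for text in pages.values():
        # A set collapses repeated uses of the same Wiki Name on one page.
        targets = set(WIKI_WORD.findall(text))
        # Drop links to pages that have not been created yet.
        total += len(targets & existing)
    return total / len(pages)


sample = {
    "FrontPage": "See WikiGraph, WikiGraph again, and MissingPage.",
    "WikiGraph": "Back to FrontPage.",
    "PrivateWiki": "No links here.",
}
print(avg_frontlinks(sample))
```

In the sample, `FrontPage` counts 1 link (the repeat and the missing page are dropped), `WikiGraph` counts 1, and `PrivateWiki` counts 0, so the average is 2/3.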

Hmm, what's an upper bound? How much is too bushy? Fake scenarios:

• `n=1000, avg=100 -> fractal dimension: 0.99856`

• `n=1000, avg=3 -> fractal dimension: 0.960`
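On the upper-bound question, rearranging the formula suggests that `D` can approach 1 but never reach it. This is just my own rearrangement of the definition above (with `n > 1` and `m > 0`), not something taken from the paper:

```latex
D = \frac{\ln n}{\ln n + \ln\!\left(\frac{1+m}{m}\right)}
  = \frac{1}{1 + \dfrac{\ln\!\left(1 + \frac{1}{m}\right)}{\ln n}} < 1,
\qquad
\lim_{m \to \infty} D = 1,
\qquad
1 - D \approx \frac{1}{m \ln n} \quad \text{for large } m .
```

For `n=1000, avg=100` the approximation gives `1 - D ≈ 1 / (100 × 6.908) ≈ 0.00145`, in line with the 0.99856 above. So the formula by itself never flags a space as "too bushy": adding links only pushes `D` closer to 1.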