<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
><channel><title>Datavisualization.ch &#187; DataMining</title> <atom:link href="http://datavisualization.ch/tag/datamining/feed/" rel="self" type="application/rss+xml" /><link>http://datavisualization.ch</link> <description>Datavisualization.ch is the premier news and knowledge resource for data visualization and infographics.</description> <lastBuildDate>Thu, 10 May 2012 07:22:40 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.2</generator> <item><title>The Google Books Ngram Viewer</title><link>http://datavisualization.ch/tools/the-google-books-ngram-viewer/</link> <comments>http://datavisualization.ch/tools/the-google-books-ngram-viewer/#comments</comments> <pubDate>Tue, 21 Dec 2010 09:12:41 +0000</pubDate> <dc:creator>Peter Gassner</dc:creator> <category><![CDATA[Tools]]></category> <category><![CDATA[BigData]]></category> <category><![CDATA[DataMining]]></category> <category><![CDATA[Research]]></category><guid
isPermaLink="false">http://datavisualization.ch/?p=6678</guid> <description><![CDATA[The Google Books Ngram Viewer shows the power of visualization: instead of offering a huge but abstract data set, Google created a simple visualization tool that shows the data and makes it easily queryable.]]></description> <content:encoded><![CDATA[<a
href='http://datavisualization.ch/tools/the-google-books-ngram-viewer/' title='The Google Books Ngram Viewer' class='share_image'><img
src='http://datavisualization.ch/wp-content/uploads/2010/12/teaser.png' title='The Google Books Ngram Viewer' alt='The Google Books Ngram Viewer' /></a><p>One aspect that the release of the <a
href="http://ngrams.googlelabs.com/">Google Books Ngram Viewer</a> last week shows really well is the power of visualization: instead of offering a huge but <em>abstract</em> <a
href="http://ngrams.googlelabs.com/graph?content=dataset%2Cdata+set&amp;year_start=1960&amp;year_end=2008&amp;corpus=0&amp;smoothing=3">data set</a> like <a
href="http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html">back in 2006</a>, Google created a simple visualization tool that <em>shows</em> the data and makes it easily <em>queryable</em>. It&#8217;s not as visually appealing as what people like Chris Harrison <a
href="http://www.chrisharrison.net/projects/trigramviz/index.html">have done with similar data</a>, but it doesn&#8217;t have to be! The purpose of this tool is to give first insights and spawn ideas, which can then lead to a deeper analysis.</p><p>What I find most exciting about this project is that Google enables everyone (no programming skills necessary) to ask questions and dig into a centuries-old corpus of accumulated wisdom in over 5 million books in 6 languages.</p><p>While playing with the Ngram Viewer and looking through other people&#8217;s queries (click on the charts to go to the source), I noticed that there are different kinds of questions people tend to ask, so I came up with this incomplete and unscientific categorization of what the Ngram Viewer is about, which I&#8217;d like to put up for discussion.</p><h3>It&#8217;s About Comparing Things</h3><p>A very simple but powerful use case of the Ngram Viewer is to compare ideas, products, concepts, etc. over time. People like to think in comparisons like &#8220;good&#8221; and &#8220;bad&#8221;, so this is an ideal entry point for people who don&#8217;t quite know what to do with this tool. As a case in point, I wanted to look at how the pie chart stacks up against other visualization methods, and made a first observation: these charts are always opinionated, because you can (and have to) leave words out, forget them, or spell them differently than others.</p><p><a
href="http://ngrams.googlelabs.com/graph?content=pie+chart%2Cline+chart%2Cbar+chart%2Cscatterplot%2Chistogram&amp;year_start=1880&amp;year_end=2000&amp;corpus=0&amp;smoothing=3"><img
class="size-full wp-image-6682 alignnone" title="pie-bar-line-chart" src="http://datavisualization.ch/wp-content/uploads/2010/12/pie-bar-line-chart.png" alt="" width="710" height="260" /></a></p><p>Another comparison I wanted to make was about what the development of communication media looks like over the years. Here, I noticed a difficulty: The Ngram Viewer is case-sensitive, so be careful how you spell &#8220;Internet&#8221;, as there will be fewer results when written in lower-case.</p><p><a
href="http://ngrams.googlelabs.com/graph?content=telegraph%2Ctelephone%2Cphone%2Cfax%2Cemail%2CInternet&amp;year_start=1800&amp;year_end=2008&amp;corpus=0&amp;smoothing=3"><img
class="size-full wp-image-6685 alignnone" title="telegraph-internet" src="http://datavisualization.ch/wp-content/uploads/2010/12/telegraph-internet.png" alt="" width="710" height="260" /></a></p><h3>It&#8217;s About Patterns</h3><p>Many people discover interesting patterns, like the occurrence of year numbers. Seems logical, when you see it, but did you think of this before?</p><p><a
href="http://ngrams.tumblr.com/post/2363999671/1900-1910-1920-1930-1940-1950-1960-1970"><img
class="alignnone size-full wp-image-6687" title="year-patterns" src="http://datavisualization.ch/wp-content/uploads/2010/12/year-patterns.png" alt="" width="710" height="260" /></a></p><h3>It&#8217;s About Correlations</h3><p>If you suspect that one thing could have an influence on another, just go to the website, try out some terms, and see whether they occur in literature during the same time periods. This, of course, is not a definitive answer, but it&#8217;s a good starting point for an investigation.</p><p><a
href="http://ngrams.tumblr.com/post/2362972889/inflation-unemployment-english-by"><img
class="alignnone size-full wp-image-6688" title="inflation-unemployment" src="http://datavisualization.ch/wp-content/uploads/2010/12/inflation-unemployment.png" alt="" width="710" height="260" /></a></p><h3>It&#8217;s About Phrases</h3><p>The term &#8220;<a
href="http://en.wikipedia.org/wiki/N-gram">n-gram</a>&#8221; describes a sequence of n words (or characters) that occur in a specific order. The Google data is available for n-grams of up to 5 words, which means that it is possible to search not only for single words, but also for phrases and sayings.</p><p><a
href="http://ngrams.googlelabs.com/graph?content=we+have+a+problem%2Cwe+have+a+solution&amp;year_start=1800&amp;year_end=2000&amp;corpus=0&amp;smoothing=3"><img
class="alignnone size-full wp-image-6690" title="problem-solution" src="http://datavisualization.ch/wp-content/uploads/2010/12/problem-solution.png" alt="" width="710" height="260" /></a></p><h3>It&#8217;s About Language</h3><p>Because the data repository goes back to the 17th century, this tool can give us an interesting insight into the development of languages, as in the visualization below, which shows how the <a
href="http://en.wikipedia.org/wiki/Long_s">medial s</a> (ſ) was superseded by the &#8220;normal&#8221; s. When looking for insights using this tool, always be aware that words may have been written differently centuries ago, so they may not show up if you don&#8217;t know what to look for.</p><p><a
href="http://ngrams.tumblr.com/post/2345489273/when-the-long-s-fell-out-of-use-beft-best"><img
class="alignnone size-full wp-image-6694" title="medial-s" src="http://datavisualization.ch/wp-content/uploads/2010/12/medial-s.png" alt="" width="710" height="260" /></a></p><h3>It&#8217;s About History</h3><p>Books reflect the history of the world, so I queried the Ngram Viewer for &#8220;guerre&#8221; (which is French for &#8220;war&#8221;), a (sadly) omnipresent part of human history. I did the query in French because a lot of historic wars took place in France, and it shows indeed: the French Revolution (1789–1799), the Revolutionary and Napoleonic Wars (1792–1815), the Franco-Prussian War (1870–1871), and then, of course, the two World Wars. If you do the same query in American English, you&#8217;ll also notice a strong bump in the 1970s for the Vietnam War, which didn&#8217;t have the same impact on France as it did on the USA.</p><p>I also made a query for &#8220;baïonette&#8221; (bayonet), a tool of war, and indeed, it correlates with the wars. We can also see when it became available and that it&#8217;s less used today (I guess that it still shows up because it&#8217;s written about in history books).</p><p>This shows another interesting use case for the Ngram Viewer: let a teacher ask her students &#8220;what do you see?&#8221; They&#8217;ll (hopefully) know about the two World Wars, but then they&#8217;ll have to go and do some research about what those earlier spikes might mean.</p><p><a
href="http://ngrams.googlelabs.com/graph?content=guerre&amp;year_start=1700&amp;year_end=2008&amp;corpus=7&amp;smoothing=4"><img
class="alignnone size-full wp-image-6698" title="guerre" src="http://datavisualization.ch/wp-content/uploads/2010/12/guerre.png" alt="" width="710" height="404" /></a></p><h3>It&#8217;s About Society</h3><p>The last example that I want to go into is one that isn&#8217;t directly possible with the current version of the Ngram Viewer: the comparison of societal change across different language areas. I supposed that &#8220;racism&#8221; would have had different impacts in different regions of the world, the USA specifically. And indeed, when we superimpose queries in American English, British English, German and French using Photoshop (be sure to adjust the percentage scales correctly), we can see the bump in the late Sixties in American, but not in British, literature. Also interesting is the development in France, which is strangely linear and different from all the others.</p><p><img
class="alignnone size-full wp-image-6700" title="racism" src="http://datavisualization.ch/wp-content/uploads/2010/12/racism.png" alt="" width="710" height="260" /></p><h2>Conclusion</h2><p>I hope you had as much fun and gained as many insights as I did while researching this article. I strongly believe that by making a visual viewer available for this huge data set, Google did a great service to a lot of people who wouldn&#8217;t otherwise have a chance to dig into this data at all.</p><p>So, go <a
href="http://ngrams.googlelabs.com/graph?content=do+it+yourself&amp;year_start=1800&amp;year_end=2000&amp;corpus=0&amp;smoothing=3">try the tool yourself</a> and post interesting queries in the comments or to the <a
href="http://ngrams.tumblr.com/">Ngrams Tumblelog</a>. Also be sure to read <a
href="http://ngrams.googlelabs.com/info">Google&#8217;s introduction</a> to the Ngram Viewer, which has some interesting background information. And don&#8217;t forget that you can click the links at the bottom of the charts, which will take you to the sources in the huge repository of books that Google has digitized.</p> ]]></content:encoded> <wfw:commentRss>http://datavisualization.ch/tools/the-google-books-ngram-viewer/feed/</wfw:commentRss> <slash:comments>13</slash:comments> </item> <item><title>10 Ways How Data Is Changing Our Lives</title><link>http://datavisualization.ch/notes/10-ways-how-data-is-changing-our-lifes/</link> <comments>http://datavisualization.ch/notes/10-ways-how-data-is-changing-our-lifes/#comments</comments> <pubDate>Tue, 31 Aug 2010 06:37:28 +0000</pubDate> <dc:creator>Benjamin Wiederkehr</dc:creator> <category><![CDATA[Notes]]></category> <category><![CDATA[Collection]]></category> <category><![CDATA[DataMining]]></category><guid
isPermaLink="false">http://datavisualization.ch/?p=6097</guid> <description><![CDATA[Conrad Quilty-Harper has written an article for Telegraph.co.uk about how data is changing how we live. It's a list of 10 real-world examples in the fields of Shopping, Relationships, Business deliveries, Maps, Education, Politics, Society, War and Advertising.]]></description> <content:encoded><![CDATA[<a
href='http://datavisualization.ch/notes/10-ways-how-data-is-changing-our-lifes/' title='10 Ways How Data Is Changing Our Lives' class='share_image'><img
src='http://datavisualization.ch/wp-content/uploads/2010/08/how_data_is_changing_our_lives_01.png' title='10 Ways How Data Is Changing Our Lives' alt='10 Ways How Data Is Changing Our Lives' /></a><p>Conrad Quilty-Harper has written <a
title="10 ways data is changing how we live" href="http://www.telegraph.co.uk/technology/7963311/10-ways-data-is-changing-how-we-live.html">an article</a> for Telegraph.co.uk about how data is changing how we live. It&#8217;s a list of 10 real-world examples in the fields of Shopping, Relationships, Business deliveries, Maps, Education, Politics, Society, War and Advertising.</p><p>Although the article focuses more on the benefits of data mining for companies than for people, there are examples that show fundamental shifts in our society.</p><p>As one of the commenters points out, there are other domains that make good use of data collection and analysis, like <a
title="Ambient Life – Animated Vision of the Future" href="http://datavisualization.ch/showcases/ambient-life">Health Care</a>, <a
title="Ushahidi helps crowdsourcing crisis information" href="http://datavisualization.ch/showcases/ushahidi-helps-crowdsourcing-crisis-information">Crisis Information Management</a>, <a
title="Capture Pollution, Congestion and Road Conditions with Your Bike" href="http://datavisualization.ch/showcases/capture-pollution-congestion-and-road-conditions-with-your-bike">Urban Sensing</a>. It&#8217;s a good read; I just wish Quilty-Harper had gone into more detail about possible implications for privacy and security.</p> ]]></content:encoded> <wfw:commentRss>http://datavisualization.ch/notes/10-ways-how-data-is-changing-our-lifes/feed/</wfw:commentRss> <slash:comments>3</slash:comments> </item> <item><title>Tim Berners-Lee on Open &amp; Linked Data</title><link>http://datavisualization.ch/events/tim-berners-lee-on-open-linked-data/</link> <comments>http://datavisualization.ch/events/tim-berners-lee-on-open-linked-data/#comments</comments> <pubDate>Mon, 05 Apr 2010 07:08:20 +0000</pubDate> <dc:creator>Benjamin Wiederkehr</dc:creator> <category><![CDATA[Events]]></category> <category><![CDATA[Conference]]></category> <category><![CDATA[DataMining]]></category> <category><![CDATA[Internet]]></category><guid
isPermaLink="false">http://datavisualization.ch/?p=5044</guid> <description><![CDATA[In 2010 Tim Berners-Lee returned to TED to follow up his great speech about open and linked data from 2009. At this year's conference Sir Berners-Lee presented some of the results that sprouted from open data.]]></description> <content:encoded><![CDATA[<a
href='http://datavisualization.ch/events/tim-berners-lee-on-open-linked-data/' title='Tim Berners-Lee on Open &#038; Linked Data' class='share_image'><img
src='http://datavisualization.ch/wp-content/uploads/2010/04/tbl_open_data_01.png' title='Tim Berners-Lee on Open &#038; Linked Data' alt='Tim Berners-Lee on Open &#038; Linked Data' /></a><p>In 2010 <a
href="http://www.w3.org/People/Berners-Lee/" target="_blank">Tim Berners-Lee</a> returned to TED to follow up <a
title="Tim Berners-Lee on the next Web" href="http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html" target="_blank">his great speech</a> about open and linked data from 2009. At this year&#8217;s conference <a
title="Tim Berners-Lee: The year open data went worldwide" href="http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide.html" target="_blank">Sir Berners-Lee presented</a> some of the results that sprouted from open data.</p><p>As usual, Tim&#8217;s presentation is very passionate about the subject, and he shares his vision of the future of the web in an understandable manner. If you haven&#8217;t already seen the videos of the presentations, I recommend doing so.</p><h3>2009</h3><p><a
href="http://www.youtube.com/watch?v=OM6XIICm_qo&#038;fmt=18">http://www.youtube.com/watch?v=OM6XIICm_qo</a></p><h3>2010</h3><p><a
href="http://www.youtube.com/watch?v=3YcZ3Zqk0a8&#038;fmt=18">http://www.youtube.com/watch?v=3YcZ3Zqk0a8</a></p><p>The slides of the presentation are available <a
title="Tim Berners-Lee on the next Web Slides" href="http://www.w3.org/2009/Talks/0204-ted-tbl/#%281%29" target="_blank">here</a>.</p> ]]></content:encoded> <wfw:commentRss>http://datavisualization.ch/events/tim-berners-lee-on-open-linked-data/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Google: Zeitgeist 2009</title><link>http://datavisualization.ch/datasets/google-zeitgeist-2009/</link> <comments>http://datavisualization.ch/datasets/google-zeitgeist-2009/#comments</comments> <pubDate>Wed, 02 Dec 2009 21:00:03 +0000</pubDate> <dc:creator>Benjamin Wiederkehr</dc:creator> <category><![CDATA[Datasets]]></category> <category><![CDATA[DataMining]]></category> <category><![CDATA[Google]]></category> <category><![CDATA[LineChart]]></category><guid
isPermaLink="false">http://datavisualization.ch/?p=4145</guid> <description><![CDATA[The year 2009 is coming to an end, and a lot, really a lot, of queries have gone through the Google search box. The kind folks at Google take a look back at the happenings throughout this year. They do this as anyone would expect them to: by collecting data!]]></description> <content:encoded><![CDATA[<p>The year 2009 is coming to an end, and a lot, really a lot, of queries have gone through the Google search box. The kind folks at Google take a look back at the happenings throughout this year. They do this as anyone would expect them to: by <strong>collecting data</strong>!</p><p><a
title="Google Zeitgeist 2009" href="http://www.google.com/intl/en/press/zeitgeist2009/index.html" target="_blank"><img
class="border alignnone size-full wp-image-4148" title="chart_switzerland" src="http://www.datavisualization.ch/wp-content/uploads/2009/12/chart_switzerland.png" alt="chart_switzerland" width="702" height="320" /></a></p><p><a
href="http://www.google.com/intl/en/press/zeitgeist2009/index.html" target="_blank">Google Zeitgeist</a> is a set of lists ranking keywords by popularity from 1 to 10. &#8220;Fastest rising&#8221; and &#8220;fastest falling&#8221; are collections of keywords with big differences from 2008 to 2009. Right now the data is only available as written text, but it could quickly be transformed into a visualization.</p><blockquote><p>To compile the 2009 Year-End Zeitgeist, we studied the aggregation of billions of queries people typed into Google search this year. We use data from multiple sources, including Insights for Search, Google Trends and internal data tools. We also filter out spam and repeat queries to build out lists that best reflect &#8220;the spirit of the times.&#8221;</p></blockquote><p>So get your tools ready, and why not surprise us with your vision of how this data could best be visualized?</p> ]]></content:encoded> <wfw:commentRss>http://datavisualization.ch/datasets/google-zeitgeist-2009/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Visualize Your SSH &amp; FTP Behaviour</title><link>http://datavisualization.ch/showcases/visualize-your-ssh-ftp-behaviour/</link> <comments>http://datavisualization.ch/showcases/visualize-your-ssh-ftp-behaviour/#comments</comments> <pubDate>Tue, 06 Oct 2009 23:24:18 +0000</pubDate> <dc:creator>Benjamin Wiederkehr</dc:creator> <category><![CDATA[Showcases]]></category> <category><![CDATA[DataMining]]></category> <category><![CDATA[JavaScript]]></category> <category><![CDATA[Web]]></category><guid
isPermaLink="false">http://datavisualization.ch/?p=3602</guid> <description><![CDATA[Every time you log in to your local Unix-like machine or a remote hosting server through an FTP client to upload a file, or use SSH to get your stuff done, you’re leaving behind a trail of evidence showing your online behaviour. This visualization tool unveils and visualizes this hidden data for you.]]></description> <content:encoded><![CDATA[<p>Every time you log in to your local Unix-like machine or a remote hosting server through an FTP client to upload a file, or use SSH to get your stuff done, you’re leaving behind a trail of evidence showing your online behaviour: where and when you log in, how often and how long your online sessions are, in short: your modus operandi. This visualization tool unveils this hidden data, which is gathered by running a few built-in UNIX commands and is analyzed on-site.</p><p><img
class="alignnone size-full wp-image-3600" title="log_sessions_01" src="http://www.datavisualization.ch/wp-content/uploads/2009/10/log_sessions_01.png" alt="log_sessions_01" width="710" height="250" /><img
class="alignnone size-full wp-image-3601" title="log_sessions_02" src="http://www.datavisualization.ch/wp-content/uploads/2009/10/log_sessions_02.png" alt="log_sessions_02" width="710" height="325" /></p><p>The project started as a small personal stats program, but was recently made publicly available on the web. Besides static images generated by the script, the visualization is a collection of interactive applets using JavaScript. For further analysis, the raw data is downloadable as CSV, XML, JSON or the full log file.</p><p><span
class="read_on source">See an <a
title="FTP &amp; SSH sessions visualization" href="http://www.smallmeans.com/data-visualizations/sessions/?4abb6cd65ccd8" target="_blank">example analysis</a> or <a
title="Crunching numbers, so you don’t have to " href="http://www.smallmeans.com/data-visualizations/sessions/diy/" target="_blank">try it now</a> for yourself</span></p> ]]></content:encoded> <wfw:commentRss>http://datavisualization.ch/showcases/visualize-your-ssh-ftp-behaviour/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>Infochimps lets users share their data</title><link>http://datavisualization.ch/datasets/infochimps-lets-users-share-their-data/</link> <comments>http://datavisualization.ch/datasets/infochimps-lets-users-share-their-data/#comments</comments> <pubDate>Wed, 30 Sep 2009 05:00:55 +0000</pubDate> <dc:creator>Benjamin Wiederkehr</dc:creator> <category><![CDATA[Datasets]]></category> <category><![CDATA[DataMining]]></category> <category><![CDATA[no-image]]></category><guid
isPermaLink="false">http://datavisualization.ch/?p=3555</guid> <description><![CDATA[Infochimps.org is an online repository for raw data that's been around for more than one year now. At DEMOfall 09 they announced that they have extended the website's functionality. Infochimps.org now lets users share and even sell their own datasets.]]></description> <content:encoded><![CDATA[<p><a
title="Infochimps lets you discover, share and sell data of any size, topic, or format. " href="http://infochimps.org/" target="_blank"><img
class="" title="Infochimps.org logo" src="http://www.datavisualization.ch/wp-content/uploads/2009/09/main_logo.png" alt="Infochimps.org logo" width="100" height="100" />Infochimps.org</a> is an online repository for raw data that&#8217;s been around for more than one year now. At <a
href="http://www.demo.com/">DEMOfall 09</a> they announced the extended functionality of the website. Infochimps.org now lets users <a
href="http://infochimps.org/share" target="_blank">share</a> and even <a
href="http://infochimps.org/sell" target="_blank">sell</a> their own datasets. Besides being the best search engine for datasets, Infochimps.org could become the strongest, if not the only, real marketplace for raw data (alongside <a
href="http://aws.amazon.com/publicdatasets/">Amazon Public Datasets</a>, <a
href="http://www.datavisualization.ch/datasets/us-government-data-on-datagov">Data.gov</a>, <a
href="http://www.datavisualization.ch/datasets/ogdi">OGDI</a>, <a
href="http://www.datavisualization.ch/datasets/socrata-%E2%80%93-a-social-network-for-data">Socrata</a> and the like).</p><p>With the provided service, Infochimps.org encourages companies to open up their datasets. A secure place to host, share and sell data could ignite a more open culture around data. We are looking forward to the progress that hopefully follows this announcement.</p> ]]></content:encoded> <wfw:commentRss>http://datavisualization.ch/datasets/infochimps-lets-users-share-their-data/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Personas: Uncanny results about your persona</title><link>http://datavisualization.ch/showcases/personas-uncanny-results-about-your-persona/</link> <comments>http://datavisualization.ch/showcases/personas-uncanny-results-about-your-persona/#comments</comments> <pubDate>Wed, 19 Aug 2009 22:29:57 +0000</pubDate> <dc:creator>Benjamin Wiederkehr</dc:creator> <category><![CDATA[Showcases]]></category> <category><![CDATA[DataMining]]></category> <category><![CDATA[Internet]]></category> <category><![CDATA[Social]]></category> <category><![CDATA[Taxonomy]]></category><guid
isPermaLink="false">http://datavisualization.ch/?p=2939</guid> <description><![CDATA[Aaron Zinman, a PhD student in the Social Media Group at MIT, has recently published his installation for the Connections exhibit at the MIT Museum. The project entitled "Personas" gives you insights into how the Internet sees "You".]]></description> <content:encoded><![CDATA[<p><a
href="http://web.media.mit.edu/~azinman/">Aaron Zinman</a>, a PhD student in the <a
href="http://smg.media.mit.edu/">Social Media Group</a> at MIT, has recently published his installation for the <a
href="http://web.mit.edu/museum/exhibitions/connections/">Connections exhibit</a> at the MIT Museum. The project entitled &#8220;<a
href="http://personas.media.mit.edu/"><strong>Personas</strong></a>&#8221; delivers an insight into how the Internet sees <strong>You</strong>.</p><blockquote><p>Personas uses sophisticated natural language processing and the Internet to create a data portrait of one&#8217;s aggregated online identity.</p></blockquote><p>The user enters their first and last name to start the analysis. The machine then spiders through the web, gathers as much information as possible about the user&#8217;s name, and tries to <strong>categorize</strong> the user&#8217;s appearances. Finally, the application creates a bar divided into multiple parts, each representing a category by a different color. The bar is created in real time as the machine searches for information, which is a nice effect.</p><p><img
class="alignnone size-full wp-image-2937" title="personas_01" src="http://www.datavisualization.ch/wp-content/uploads/2009/08/personas_01.png" alt="personas_01" width="710" height="162" /><img
class="alignnone size-full wp-image-2938" title="personas_02" src="http://www.datavisualization.ch/wp-content/uploads/2009/08/personas_02.png" alt="personas_02" width="710" height="162" /></p><p>The most disturbing thing about Personas is the fact that each time I ran my name through it, the results were different. A possible explanation for this can be read in the official description:</p><blockquote><p>&#8220;Personas demonstrates the computer&#8217;s <strong>uncanny</strong> insights and its inadvertent errors, such as the mischaracterizations caused by the inability to separate data from multiple owners of the same name&#8221;</p></blockquote><p>Flickr.com already features an <a
title="Personas on Flickr.com" href="http://www.flickr.com/search/?q=personas.media.mit.edu&amp;w=all&amp;s=int#page=0">extensive collection</a> of persona profiles, and new entries are coming in constantly. It&#8217;s a nice pastime if you have a spare minute.</p><p>Already tried Personas with <strong>your name</strong>? What were your experiences with its accuracy (or the lack thereof)?</p><p><span
class="source read_on">Via <a
href="http://www.infosthetics.com/" target="_blank">InformationAesthetics</a></span></p> ]]></content:encoded> <wfw:commentRss>http://datavisualization.ch/showcases/personas-uncanny-results-about-your-persona/feed/</wfw:commentRss> <slash:comments>5</slash:comments> </item> <item><title>How to write data from the web into a CSV file</title><link>http://datavisualization.ch/notes/how-to-write-data-from-the-web-into-a-csv-file/</link> <comments>http://datavisualization.ch/notes/how-to-write-data-from-the-web-into-a-csv-file/#comments</comments> <pubDate>Wed, 15 Jul 2009 05:48:49 +0000</pubDate> <dc:creator>Benjamin Wiederkehr</dc:creator> <category><![CDATA[Notes]]></category> <category><![CDATA[DataMining]]></category> <category><![CDATA[Python]]></category><guid
isPermaLink="false">http://datavisualization.ch/?p=2615</guid> <description><![CDATA[Michael Bommarito from Computational Legal Studies shows one way to use the web as a database. He has published a neat piece of Python code to collect data from the web and write it to a single CSV file.]]></description> <content:encoded><![CDATA[<p><a
href="http://www-personal.umich.edu/~mjbommar/">Michael Bommarito</a> from <a
href="http://computationallegalstudies.com/2009/07/01/how-python-can-turn-the-internet-into-your-dataset-part-1/">Computational Legal Studies</a> shows one way to use the web as a database. He has published a neat piece of Python code to collect data from the web and write it to a single CSV file.</p><p><span
class="read_on source">Read full tutorial on <a
href="http://computationallegalstudies.com/2009/07/01/how-python-can-turn-the-internet-into-your-dataset-part-1/">Computational Legal Studies</a></span></p> ]]></content:encoded> <wfw:commentRss>http://datavisualization.ch/notes/how-to-write-data-from-the-web-into-a-csv-file/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> </channel> </rss>
