Friday 25 January 2013

Zanran - great for data in tables, charts and graphs

I regularly mention Zanran (http://www.zanran.com/) in my workshops on search and business information, and it often finds its way into the Top Tips compiled by the delegates at the end of the day.
Zanran is not a Google alternative. Rather than search the text of web pages, it extracts and indexes numerical data presented as tables, charts and images in PDF reports, spreadsheets and ordinary web pages. You can simply type in your search terms, but there are additional options for narrowing down the search by location of the web server, individual site, time period and file type.

The results page lists the files it has found with an extract highlighting the content containing your terms. In this example I am looking for data on agricultural methane emissions in the UK.


To the left of each entry is a thumbnail. Moving the cursor over the thumbnail brings up a preview of the page containing the relevant chart, table or image. This enables you to immediately assess the relevance of the data without having to download and go through a lengthy document.


If you click on the thumbnail or the title to view the whole document, you have to register (free of charge) because Zanran stores copies of the indexed documents. If you prefer to go to the original document, click on the URL button attached to the summary of the page and then on the link that is revealed. Unfortunately, you may see “page not found”, especially if the document was on a UK government department web site. Many of these sites have now been closed and their content archived, making the documents difficult to track down. Registering with Zanran is by far the easier option. Also, rather than deluge you with documents from a single site, as Google all too often does, Zanran gives you a link telling you whether other results are available on that site and, if so, how many.

How does it compare with Google? Well, Google did come up with relevant results for my search but I had to spend a lot of time ploughing through them to identify the best documents. And Google did not pull up in the first 100 results the very useful archived UK government documents that Zanran gave me.



If you are looking for data or statistics, Google still does a very good job, but I recommend you also run a search in Zanran. It may well come up with a real gem, as it often has for me.


Tuesday 8 January 2013

EU launches public beta of its open data portal

The EU has launched a public beta of its new open data portal at http://open-data.europa.eu/open-data/. Open data is information that can be freely used, re-used and redistributed by anyone. The EU portal covers all the information that public bodies in the European Union produce, collect or pay for. At present it has 5,811 datasets of which 5,634 come from Eurostat, the statistical office of the EU.

You can search the datasets by keyword and refine your results using the keywords and publishers listed on the right of the screen.


Alternatively there are options for browsing the datasets using tags and keywords. This may be easier if you are not sure of what terms to use.


Using the tags also seems to be more reliable. A search on 'coal production' gave me one relevant dataset, but the rest of the results only had 'production' as a keyword. I was seeing sets for carrot production, production of butter, sunflower production and so on. I assume that 'coal' had been dropped because there were so few results containing both terms. Searching on just 'coal' reduced the number of results from around 5000 to 7, one of which was highly relevant (Primary production of coal and lignite). The other 6 covered energy production in general, including coal. Browsing and narrowing down the sets using the tags does seem to be the best way of navigating the data at the moment.
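To make the difference concrete, here is a small Python sketch of "all terms must match" versus "any term may match". It is only an illustration of why a loose match floods the results with anything about production. It is not the portal's actual search logic, which I can only guess at. The function names are my own, and the titles are simply examples of the kind of results mentioned above.

```python
def matches(title, terms, require_all=True):
    """Check whether a title contains all (or any) of the search terms as whole words."""
    words = set(title.lower().split())
    hits = [term.lower() in words for term in terms]
    return all(hits) if require_all else any(hits)


def search(titles, query, require_all=True):
    """Return the titles that match the query under the chosen rule."""
    terms = query.split()
    return [title for title in titles if matches(title, terms, require_all)]


# Illustrative titles only, based on the kinds of results described above.
titles = [
    "Primary production of coal and lignite",
    "Carrot production",
    "Production of butter",
    "Sunflower production",
]

# Strict matching: both 'coal' and 'production' must appear.
strict = search(titles, "coal production")

# Loose matching (an assumption about what happened): either term is enough.
loose = search(titles, "coal production", require_all=False)

print(strict)
print(len(loose))
```

With strict matching only the coal dataset comes back. With loose matching every title containing 'production' is returned as well, which is much like what I saw.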

Once you have identified a relevant dataset, additional information such as time span and date last modified is provided, together with links for downloading the data.


It's then up to you to find a way of viewing and analysing the data!
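If you are comfortable with a little scripting, one possible starting point is Python's built-in csv module. The sketch below assumes the file you downloaded is delimited text (comma or tab separated) with a header row. That is an assumption on my part, so check the format of your own download first. The column names and numbers in the sample are placeholders for illustration, not real portal data.

```python
import csv
import io


def load_rows(text, delimiter=","):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def column_values(rows, column):
    """Return the numeric values in one column, skipping blanks and non-numbers."""
    values = []
    for row in rows:
        try:
            values.append(float(row.get(column)))
        except (TypeError, ValueError):
            # Missing, empty or non-numeric cell: leave it out.
            continue
    return values


# Placeholder sample, NOT real data: hypothetical 'year' and 'value' columns.
sample = "year,value\n2009,10\n2010,12.5\n2011,\n2012,n/a\n"

rows = load_rows(sample)
values = column_values(rows, "value")

print(len(rows), "rows read")
print("usable values:", values)
```

For a tab-separated file, pass delimiter="\t" to load_rows. To read from a file on disk rather than a string, open it and pass the result of its read() method as the text.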