Sunday, 29 June 2008

Tag clouds for analysing documents

CV not getting you those all-important interviews? Nobody answering your job advert? Or perhaps your corporate publicity is not doing the biz? Processing your document through a tag cloud generator might give you a clue as to where you are going wrong. Sue Hill gave a presentation at the recent City Information Group open day on CPD and skills. In passing she mentioned that they sometimes run a CV or job description through a tag cloud generator to show people why their lovingly created prose is way off the mark. The tag cloud brings to the fore your most used terms and it can be a shock to discover that you have placed the emphasis in totally the wrong area. It then struck me that you could do this with any form of literature - a web page, training publicity, membership recruitment forms.

There are dozens, if not hundreds, of tag cloud generators on the Web and most of them are free. For starters try Wordle, Tagcrowd, or Tag Cloud Generator. The example below is a tag cloud of the UKeiG home page generated by Wordle.
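If you are curious about what a tag cloud generator does under the bonnet, the core of it is simply counting words. The following is a minimal sketch in Python, not how Wordle or any of the named tools actually work: the function names and the tiny stop-word list are my own assumptions for illustration, and real generators use far longer stop-word lists and scale font size rather than printing asterisks.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# real tag cloud generators ignore many more common words.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "with", "on", "i"}


def top_terms(text, n=10):
    """Return the n most frequent words that are not stop words, as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


def text_cloud(text, n=10):
    """Render a crude text-only 'cloud': one line per term, one asterisk per occurrence."""
    return "\n".join(f"{word:<15}{'*' * count}" for word, count in top_terms(text, n))


if __name__ == "__main__":
    cv = "Managed the team. Managed budgets and managed projects for the team."
    print(text_cloud(cv, 5))
```

Paste in the text of a CV or job advert in place of the sample sentence and the terms you have leaned on most heavily rise to the top of the list.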


Tuesday, 17 June 2008

Top Search Tips

I ran another advanced search workshop (Google and Beyond) for UKeiG on June 11th, this time in London. Twenty people attended the event and came up with the following list of top search tips at the end of the day.

1. Use the Advanced Search screen. There are lots of goodies to be found on the advanced search screens: options for focussing your search by file format (e.g. xls for data and statistics, ppt for expert presentations, pdf for industry or government reports); site and domain search to limit your search to just one web site or a type of organisation (e.g. UK government, US academic); and in Google there is a numeric range search.

2. Google Custom Search Engines (Google CSE) at http://www.google.com/coop/cse/. This made its first appearance in the Top Tips from the Liverpool workshop earlier this year. Ideal for building collections of sites that you regularly search, to create a searchable subject list, or to offer your users a more focused search option.

3. See what Google does with your search string.

a) If you use the default search box and Google comes back with odd results, click on Advanced Search to see what it has done with your search terms.

b) If you use the Advanced Search screen and fill in the boxes, see how Google formats the search strategy by looking at the search box at the top of the results page. By learning the commands and prefixes you can build more specific searches more quickly on the default search page.



4. Cached copies. Look at the search engine's cached copy of a web page if you can't find your search terms in the document or if the page is nothing like the description in the results list. You will see the version of the page that has been used by the search engine for indexing, with your terms highlighted.

5. Use tools such as Intelways and Zuula for quick and easy access to a wide range of search tools covering different types of information. Enter your search once, click on the tab for the type of resource for which you are searching (video, images, reference, news etc.), and then work your way through the list of search engines.

6. Alacrawiki. The Alacra Spotlights section is a good starting point for evaluated sites and information on industry sectors. It is also a good example of what to look for when assessing the quality of a wiki and how easy it is for anyone to edit the pages. In the Spotlights sections there is no edit option, not even if you register for an account and log in. Only the Alacra editors can edit the pages.

7. Open access journals. Google Scholar sometimes leads you to copies of journal articles in institutional repositories and open access journals, but there are also directories of open access journals. For example: http://www.doaj.org/, http://www.wsis-si.org/oa-journals.html, http://www.abc.chemistry.bsu.by/current/fulltext.htm. This is not my area of expertise so comments on other directories are welcome.

8. Social bookmarking sites. Try social bookmarking sites, not only for creating your own evaluated lists of sites but for searching other people's. For example FURL, Del.icio.us, Connotea, 2Collab. Connotea (owned by the Nature Publishing Group) and 2Collab (owned by Elsevier) are aimed at researchers and scientists.

9. Search results visualisation. Try out some of the newer search tools that present results and search options in a different way. For example Cluuz, Kartoo, Kvisu, Quintura. [Some of the participants specifically mentioned Cluuz and Kvisu].

10. The Internet Archive (Wayback Machine) at http://www.archive.org/ for pages, sites and documents that have disappeared. Ideal for tracking down lost documents, seeing how organisations presented themselves on the Web in the past, and for collecting evidence for a legal case (e.g. 'passing off', copyright infringement).
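Tips 1 and 3 above come down to learning a handful of prefixes that you can type straight into the default search box. As a rough illustration, here is a short Python sketch that assembles such a search string. The `filetype:` and `site:` prefixes and the `..` number range are the ones Google displays when you fill in the Advanced Search screen; the function names `build_query` and `search_url` are hypothetical and exist only for this example.

```python
from urllib.parse import urlencode


def build_query(terms, filetype=None, site=None, num_range=None):
    """Combine plain search terms with optional Google prefixes.

    filetype  -- file format, e.g. "xls", "ppt" or "pdf"
    site      -- a web site or domain, e.g. "gov.uk"
    num_range -- (low, high) pair for a numeric range search, e.g. (2000, 2008)
    """
    parts = [terms]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    if num_range:
        low, high = num_range
        parts.append(f"{low}..{high}")
    return " ".join(parts)


def search_url(query):
    """Turn a search string into a Google results URL (spaces and colons are encoded)."""
    return "http://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query("wind energy statistics", filetype="xls", site="gov.uk")
    print(query)
    print(search_url(query))
```

The string it prints is exactly what you would see in the search box at the top of the results page after filling in the equivalent Advanced Search boxes by hand.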

Wednesday, 11 June 2008

Energy Export Databrowser

The Energy Export Databrowser, set up by Jonathan Callahan, is based on BP's 2007 Statistical Review and provides a quick and easy way to view country data on consumption, import and export of crude oil and natural gas. It covers over 80 countries and the data goes back to the 1960s. There is feedback on the browser itself and an interesting discussion on the accuracy and validity of the underlying data on The Oildrum.


Wednesday, 4 June 2008

Directories: Major Companies of the World 2008

Seven new editions of the World's Major Companies Series have been published by Graham & Whiteside and are now available for purchase on the dataresources web site.

Major Chemical and Petrochemical Companies of the World 2008
This directory covers more than 7,000 of the leading chemical and petrochemical companies worldwide.

Major Energy Companies of the World 2008
More than 4,000 companies involved in coal mining and coal products; electricity supply; fuel distribution; natural gas supply; nuclear engineering; oil and gas exploration and production; oil and gas services and equipment; and oil refining worldwide.

Major Financial Institutions of the World 2008 (2 Vols)
Over 9,000 leading financial institutions worldwide, including banks, investment, insurance and leasing companies.

Major Food and Drink Companies of the World 2008
9,800 of the leading food, alcoholic and non-alcoholic drink companies worldwide.

Major Information Technology Companies of the World 2008
This directory covers more than 3,100 of the leading information technology companies worldwide.

Major Pharmaceutical and Biotechnology Companies of the World 2008
The world's largest pharmaceutical companies, providing essential business profiles of the international leaders in the industry.

Major Telecommunications Companies of the World 2008
Profiles of more than 3,500 of the leading telecommunications companies worldwide, including many of the top Internet companies.

Wednesday, 28 May 2008

INFORUM starts in Prague today

The 14th annual INFORUM conference starts today in Prague at 10.30 Prague time. INFORUM covers professional electronic information resources for research, development, education and business purposes. If you are not able to attend the event in person, live video broadcasting of the sessions being run in the New Auditorium will be available at www.ikaros.cz. The programme of the event is at http://www.inforum.cz/en/programme/.

I shall be twittering some of the sessions (Twitter name karenblakeman) and I am sure there will be others. Unfortunately, because Twitter is "stressing out a bit" at the moment you can only view one page of tweets. The 'Older' option has been temporarily suspended, which is very annoying if you are trying to follow conference tweets. My own tweets are recorded daily by LoudTwitter at http://karenblakeman.livejoural.com/.

Sunday, 25 May 2008

Workshop: Effective Use of Web 2.0 in Business

If you were not able to attend my recent workshops on Web 2.0, I am running a similar course at Manchester Business School on June 5th. The workshop will start with a brief overview of Web 2.0 and what it means, then look in more detail at the different applications. As is usual with my workshops there is a substantial practical element so that you can try out the technologies for yourself. Details and a booking form are at www.mbs.ac.uk/bis-training or you can call the Business Information Service on 0161 275 6503.

Academic Live and Live Books axed

I did a double take when I scanned through my RSS feeds this morning. Live Search have announced that they are closing down Academic Live and Live Books Search. Surely a late report of an April Fool, I thought. Unfortunately it was a genuine posting on Live Search's official blog. Both sites will be taken down this week and they are winding down their digitization initiatives, including their library scanning and their in-copyright book programs.

I have tried to support Live.com and promote it to those who attend my workshops as a viable alternative to Google. In my experience, it seems to have the most up-to-date database, often finds pages and documents that the other search engines miss, and has a great command for locating RSS feeds on a web site. But it keeps shooting itself in the foot. The site recently had a makeover, but the presentation of the advanced search is still awful and the only reliable way of using the options is via the command line. Live News has improved greatly and now has an RSS alert option, but only in the US version of Live. See my earlier posting Live.com updates news interface - but only for the US. And it had by far the best link and linkdomain commands but disabled those because of mass automated data mining.

Both Live Books and Academic Live were superior to Google's offerings. They had different coverage but the advanced search options, for example date and author search, actually worked in Live, and Academic Live had options for exporting records to RefWorks and EndNote, albeit one by one. Live goes on to say in its announcement that books and scholarly publications "will continue to be integrated into our Search results, but not through separate indexes." Sorry, but not good enough. That will work fine if you know exactly what you are looking for and it is a very narrowly focussed search (for example, I can easily find my husband's papers on ESR studies of zeolites), but it is impossible to limit a search to books or peer-reviewed papers on a more general topic.

It seems that this part of the market does not make enough money for Live and it says that it will now "focus on verticals with high commercial intent, such as travel, and offer users cash back on their purchases from our advertisers." Bribery appears to be part of the new company policy: another headline in my morning feed update reads "Office 2007 plus petrol: Microsoft Australia is trying to lure Aussies to buy Office 2007 with petrol"!

Forget about self-inflicted metatarsal wounds: I am beginning to suspect that Live has a serious death wish. I wonder what will be the next part of Live to go?