Thursday 29 December 2016

Seasonal opening times - never trust Google's answers (or Bing's)

This is my usual Christmas/New Year reminder to never trust Google's answers (or Bing's) on opening times of shops over the holiday season, especially if you are thinking of visiting small, local, independent shops.

I was contemplating going to our True Food Co-operative but suspected that it might still be shut. A search on my laptop for True Food Emmer Green opening times gave me a link to their website at the top of the results list. On the right hand side was a knowledge graph with information on the shop, its opening times and reviews, all compiled from a variety of sources. For most of it the source of the information is not given. On my mobile and tablet it is the knowledge graph that appears at the top of the results list and takes up the first couple of screens.



It claims that the shop is "Open today 10am-6pm" [today is Thursday, 29th December].

When I go to True Food's website it clearly states near the top of the home page that they are currently closed and re-opening on 4th January 2017.



Google gets it wrong again in the knowledge graph but so does Bing. So, always check the shop's own website, and if you are searching on your mobile or tablet please make the effort to scroll down a couple of screens to get to links to more reliable information.

Tuesday 15 November 2016

How to write totally misleading headlines for social media

Or how to seriously annoy intelligent people by telling deliberate lies.

A story about renewable energy has been doing the rounds within my social media circles, and especially on Facebook. It is an article from The Independent newspaper that has been eagerly shared by those with an interest in the subject. The headline reads "Britain just managed to run entirely on renewable energy for six days".

This is what it looks like on Facebook:

britain_entriely_run_renewable_energy_1

My first thought was that, obviously, this was complete nonsense. Had all of the petrol and diesel powered cars in Britain been miraculously converted to electric and hundreds of charging points installed overnight? I think that we would have noticed, or perhaps I am living in a parallel universe where such things have not yet happened.  So I assumed that the writer of the article, or the sub-editor,  had done what some journalists are prone to do, which is to use the terms energy and electricity interchangeably. Even if they meant "electricity"  I still found the claim that all of our electricity had been generated from renewable sources for six days difficult to believe.

Look below the  headline and you will see that the first sentence says "More than half of the UK’s electricity has come from low-carbon sources for the first time, a new study has found." That is more like it. Rather than "run entirely on renewable energy" we now have "half of the UK's electricity has come from low-carbon sources" [my emphasis in both quotes]. But why does the title make the claim when straightaway the text tells a different story? And low carbon sources are not necessarily renewable, for example nuclear. As I keep telling people on my workshops, always click through to the original article and read it before you start sharing with your friends.

The title on the source article is very different from the Facebook version, as is the subtitle.

britain_entriely_run_renewable_energy_2

We now have the title "Half of UK electricity comes from low-carbon sources for first time ever, claims new report", which is possibly more accurate. Note that "renewable" has gone and we have "low carbon sources" instead. Also, the subtitle muddies the waters further by referring to "coal-free".

If you read the article in full it tells you that "electricity from low-emission sources had peaked at 50.2 per cent between July and September" and that happened for nearly six days during the quarter. So we have half of electricity being generated by "low emission sources" but, again, that does not necessarily equate to renewables. The article does go on to say that the low emission sources included UK nuclear (26 per cent), imported French nuclear, biomass, hydro, wind and solar. Nuclear may be low emission or low carbon but it is not a renewable.

Many of the other newspapers are regurgitating almost identical content that has all the hallmarks of a press release. As usual, hardly any of them give a link to the original report but most do say it is a collaboration between Drax and Imperial College London. If you want to see more details or the full report then you have to head off to your favourite search engine to hunt it down.  It can be found on the Drax Electric Insights webpage. Chunks of the report can be read online (click on Read Reports near the bottom of the homepage) or you can download the whole thing as a PDF. There is also an option on the Electric Insights homepage that enables you to explore the data in more detail.

This just leaves the question as to where the Facebook version of the headline came from. I suspected that a separate and very different headline had been specifically written for social media. I tested it by copying the URL and headline of the original article using a Chrome extension and pasting them into Facebook. Sure enough, the headline automatically changed to the misleading title.

To see exactly what is going on and how, you need to look at the source code of the original article:

britain_entriely_run_renewable_energy_3

Buried in the metadata of the page and tagged "og:title" is the headline that is displayed on Facebook. This is the only place where it appears in the code. The "og:title" is one of the Open Graph meta tags that tell Facebook and other social media platforms what to display when someone shares the content. Thus you can have totally different "headlines" for the web and Facebook that say completely different things.
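
If you want to check a page for this yourself without reading through the source code by eye, the comparison can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the sample HTML is a hypothetical, cut-down page built from the two headlines quoted above, not the real source of the article, and the class and function names are my own.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect the <title> text and any Open Graph og:title from a page."""

    def __init__(self):
        super().__init__()
        self.page_title = ""
        self.og_title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("property") == "og:title":
            # This is the headline that social media platforms display.
            self.og_title = attributes.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.page_title += data


def compare_headlines(html):
    """Return (web headline, social media headline) for a page's HTML."""
    parser = HeadlineParser()
    parser.feed(html)
    return parser.page_title.strip(), parser.og_title


# Hypothetical, cut-down page source using the two headlines quoted above.
SAMPLE_HTML = """
<html><head>
<title>Half of UK electricity comes from low-carbon sources for first time ever, claims new report</title>
<meta property="og:title" content="Britain just managed to run entirely on renewable energy for six days" />
</head><body></body></html>
"""

web_title, social_title = compare_headlines(SAMPLE_HTML)
print("Web headline:   ", web_title)
print("Social headline:", social_title)
```

If the two lines printed are very different, that is a good sign that a separate headline has been written for social media.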

Compare "Britain just managed to run entirely on renewable energy for six days" with "Half of UK electricity comes from low-carbon sources for first time ever, claims new report" and you have to admit that the former is more likely to get shared. That is how misinformation spreads. Always, always read articles in full before sharing and, if possible, try and find the original data or report. It is not always easy but we should all have learnt by now that we cannot trust politicians, corporates or the media to give us the facts and tell the full story.

Update: The original press release from Drax is titled "More than 50% of Britain’s electricity now low carbon according to ground-breaking new report".

Monday 31 October 2016

WebSearch Academy presentations - edited highlights

Edited highlights from the presentations I gave at the WebSearch Academy on 17th October 2016 at the Olympia Conference Centre, London are now available on SlideShare.  They are also available on authorSTREAM. These are selected slides from the presentations; if you attended the event and would like copies of the full sets please contact me.

The presentations are:

New Dimensions in Search: seeing, hearing viewing (takes you to authorSTREAM). Searching for images, video and audio.

WebSearch Academy: If not Google then what? (takes you to authorSTREAM). Looks at alternatives to Google and some specialist tools.

SlideShare options for both are given below.





 

 



Saturday 22 October 2016

Google results: review stars may not refer to what you think they do

The contract for our domestic electricity supply is ending next month so I am trawling through cost comparison and energy supplier websites to check tariffs for our next contract. (UK readers can skip the rest of this explanatory paragraph). I don't know what the situation is in other countries but in the UK the gas and electricity suppliers are forever inventing a variety of tariffs priced significantly less than their "standard" rates to entice you to sign up. The lower priced tariffs are generally only available for a year, or two years at most. At the end of the contract the customer is usually transferred to the more expensive standard rate unless they actively seek out an alternative. The existing supplier is obliged to inform the customer of the new tariffs that will be on offer but the onus is on the customer to inform the company which tariff, if any, they wish to switch to.  For other suppliers' tariffs the customer has to do their own research.

Price comparison sites are a good starting point to identify potential alternatives but the only way to check that a tariff meets all of your criteria, of which price may be just one of many, is to go direct to the supplier's website. Today I spent most of the morning drawing up the shortlist.

The next step in my strategy was to look at customer reviews on the comparison websites, social media, discussion boards and to run a Google search on each supplier. The reviews and comments generally spanned several years and while the history of a company's customer service performance can be useful it is the last 12-18 months that are most relevant. This is where limiting the search to more recent information by  using Google's date option comes into play. Having spent an hour or so to get this far, and with my brain beginning to wilt, it was tempting to read just the Google snippets for the reviews; but they can convey the wrong overall impression. Google sometimes creates snippets by pulling together text from two or more sections of a page that may be separated by several paragraphs and which may be about completely different products or topics. Never take the snippet at face value and always click through to the original, full article.

One of the energy providers on my shortlist is Robin Hood Energy, which is a not-for-profit company run by Nottingham City Council and has only recently been made available to customers outside of Nottingham. Customer reviews are therefore less plentiful than for many of the other utilities. The results from a search on

Robin Hood Energy customer reviews

included one from Simply Switch. Underneath the title and URL is a star rating of 4.4 from 221 reviews and one could be forgiven for assuming that this refers to Robin Hood Energy. This is reinforced by the text in the second half of the snippet: "Robin Hood guarantee their customers consistently low prices ... rated 4.4/5 based on 221 reviews".

robin_hood_customer_reviews

The dots are important in that they represent a missing chunk of text between the two pieces of information. When I looked at the web page itself the rating was nowhere to be found in the main body of the text. It was in the footer of the page and referred to the Simply Switch site.

simply_switch_reviews

A reminder, then, to never rely on the snippets for an answer, and always click through and read the whole web page.
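
To illustrate why the dots matter, here is a minimal Python sketch that splits a snippet at the ellipsis and measures how far apart the two pieces are in the text of the page. The snippet is the one quoted above, but the page text is hypothetical filler that stands in for the real Simply Switch page, and the function names are my own.

```python
def snippet_fragments(snippet):
    """Split a search snippet into the pieces separated by ellipses."""
    normalised = snippet.replace("\u2026", "...")
    return [part.strip() for part in normalised.split("...") if part.strip()]


def gap_between(page_text, first, second):
    """Return the number of characters separating two fragments on a page.

    Returns None if either fragment cannot be found in that order.
    """
    start = page_text.find(first)
    if start == -1:
        return None
    end_of_first = start + len(first)
    next_start = page_text.find(second, end_of_first)
    if next_start == -1:
        return None
    return next_start - end_of_first


SNIPPET = (
    "Robin Hood guarantee their customers consistently low prices "
    "... rated 4.4/5 based on 221 reviews"
)

# Hypothetical page text: the filler stands in for the body of the real page.
PAGE_TEXT = (
    "Robin Hood guarantee their customers consistently low prices"
    + " [several paragraphs of other text] " * 20
    + "Simply Switch is rated 4.4/5 based on 221 reviews"
)

first, second = snippet_fragments(SNIPPET)
print(gap_between(PAGE_TEXT, first, second), "characters between the two fragments")
```

A large gap is a hint that the two halves of the snippet may have nothing to do with each other, which is exactly what happened here.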

Thursday 6 October 2016

Google Blogger loses links and blog lists: what to do next

Google Blogger has done it again. A major update to the service was rolled out at the end of September and many users woke up to find that the links and blog lists they had so carefully created had gone.   See the Blogger Help Forum for some of the postings and comments on the incident.  Blogger engineers are supposedly working to restore the lost information  but it "may take up to several days." Or never! This is not the first time that blog content has gone missing after an update. A few years ago an update somehow removed the most recent posts from people's blogs. Most of them were eventually recovered but a few disappeared without trace.

The lesson learned from that experience was to back up your blog. In Blogger the import and backup tool is under Settings, Other, at the top of the page. Note, though, that this will only back up the text of pages, posts and comments. It does not back up any changes you have made to the template, or the content of the gadgets in your sidebars such as links lists and blogrolls. For the template click on Template in the left-hand sidebar and then on Backup/Restore. This will save the general layout of the gadgets but not the content. For that you will need to copy and save the content for each gadget or save a copy of the content and HTML of your blog. The article "Back up your Blogger blog: photos, posts, template, and gadgets" has details of what you need to do.

And don't forget your photos. For those use Google's Takeout service at https://www.google.com/settings/takeout.

If you don't have a copy of your lists of links then see if you can access an older cached version of your blog  via Google or Bing and save the whole page, or take screen shots. If you try this several days after the event you may be out of luck. Mine were still in the cached page for up to 2 days but have now gone. In Google, use the 'cache:' command, for example:

cache:yourblogname.blogspot.com

An alternative is to search for your blog and next to your entry in the results list there should be a small downward pointing green arrow. Click on it and then on the 'Cached' text to view the page. This works in both Google and Bing and, again, the sooner you do this the better.

bing_cached_option

If none of that works then try the Wayback Machine. Type in the URL of your blog and see if they have any snapshots.

wayback_blog
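
If you have several blogs or pages to check, the Wayback Machine lookup can also be scripted. The sketch below is a minimal example that assumes the Internet Archive's availability endpoint (https://archive.org/wayback/available) and the JSON structure it returns are as I understand them; check the Internet Archive's own documentation before relying on it. The blog address is the same placeholder used above and the function names are my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def availability_url(blog_url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": blog_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response):
    """Pull the snapshot address out of a parsed response, or return None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(blog_url, timestamp=None):
    """Ask the Wayback Machine for the snapshot closest to the timestamp.

    This makes a network request, so it is not called automatically here.
    """
    with urlopen(availability_url(blog_url, timestamp)) as reply:
        return closest_snapshot(json.load(reply))


# Placeholder address: replace it with the URL of your own blog.
print(availability_url("yourblogname.blogspot.com", "20160925"))
```

Calling lookup("yourblogname.blogspot.com", "20160925") should return the address of the snapshot nearest to that date, or None if nothing has been archived.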

Still no joy? Then either hang around a while longer to see if the Blogger engineers manage to revive your lists or start rebuilding them from scratch. If you haven't looked at them in a while, maybe now is the time to review the content anyway.

Monday 19 September 2016

Essential Non-Google Search Tools for Researchers - Top Tips

This is the list of Top Tips that delegates attending the UKeiG workshop on 7th September 2016 in London came up with at the end of the training day.  Some of the usual suspects such as the 'site:' command, Carrot Search and Offstats are present but it is good to see Yandex included in the list for the first time.

  1. Carrotsearch http://search.carrotsearch.com/carrot2-webapp/search or http://carrotsearch.com/ and click on the “Live Demo” link on the left hand side of the page.
    This was recommended for its clustering of results and also the visualisations of terms and concepts via the circles and “foam tree”. The Web Search uses eTools.ch for the general searches and there is also a PubMed option.

    [caption id="attachment_3731" align="aligncenter" width="400"]Carrot Search Foam PubMed Foam Tree Carrot Search Foam PubMed Foam Tree[/caption]



  1. Advanced Twitter Search http://twitter.com/search-advanced
    The best way to search Twitter! Use the Advanced Search http://twitter.com/search-advanced or click on “More Options” on the results page. There is a detailed description of the commands and how they can be used at https://blog.bufferapp.com/twitter-advanced-search



  1. Yandex http://www.yandex.com/
    The international version of the Russian search engine with a collection of advanced commands - including a proximity operator - that makes it a worthy competitor to Google. Run your search and on the results page click on the two lines next to the search box.

    [caption id="attachment_3734" align="aligncenter" width="400"]Yandex Advanced Search Yandex Advanced Search[/caption]

    Alternatively, use the search operators. Most of them are listed at https://yandex.com/support/search/how-to-search/search-operators.xml. There is also a /n operator that enables you to specify that words/phrases must appear within a certain distance of each other, for example:

    "University of Birmingham" nanotechnology /2 2020

    There are country versions of Yandex for Russia, Ukraine, Belarus, Kazakhstan and Turkey. You will, though, need to know the languages to get the best out of them and apart from Turkey they use a different alphabet.



  1. Millionshort http://millionshort.com/
    If you are fed up with seeing the same results from Google again and again give MillionShort a try. MillionShort enables you to remove the most popular web sites from the results. The page that best answers your question might not be well optimised for search engines or might cover a topic that is so specialised that it never makes it into the top results in Google or Bing. Originally, as its name suggests, it removed the top 1 million but you can change the number that you want omitted. There are filters to the left of the results enabling you to remove or restrict your results to ecommerce sites, sites with or without advertising, live chat sites and location. The sites that have been excluded are listed to the right of the results.



  1. site: command
    Use the site: command to focus your search on particular types of site, for example include site:ac.uk in your search for UK academic websites. Or use it to search inside large rambling sites with useless navigation, for example site:www.gov.uk. You can also use -site: to exclude individual sites or a type of site from your search. All of the major web search engines support the command.



  1. Microsoft Academic Search http://academic.research.microsoft.com/
    An alternative to Google Scholar. “Semantic search provides you with highly relevant search results from continually refreshed and extensive academic content from over 80 million publications.” This was recently revamped and, although it now loads and searches faster than it used to, the new version has lost the citation and co-author maps that were so useful. It can be a useful way of identifying researchers, publications and citations but do not rely on the information too much. It can get things very wrong indeed. For example, I’ve found that for some reason the affiliation of several authors from the Slovak Technical University in Bratislava is given as the Technical University of Kenya!



  1. Wolfram Alpha https://www.wolframalpha.com/
    This is very different from the typical search engine in that it uses its own curated data. Whether or not you get an answer from it depends on the type of question and how you ask the question. The information is pulled from its own databases and for many results it is almost impossible to identify the original source, although it does provide a possible list of resources. If you want to see what WolframAlpha can do try out the examples and categories that are listed on its home page.



  1. OFFSTATS - The University of Auckland Library http://www.offstats.auckland.ac.nz/
    This is a great starting point for locating official statistical sources by country, region or subject. All of the content in the database is assessed by humans for quality and authority, and is freely available.



  1. Meltwater IceRocket http://www.icerocket.com/
    IceRocket specialises in real-time search and was recommended for inclusion in the Top Tips for its blog search and advanced search options. There is also a Trends tool that shows you the frequency with which terms are mentioned in blogs over time and which enables you to compare several terms on the same graph.

    [caption id="attachment_3739" align="aligncenter" width="400"]IceRocket Trends IceRocket Trends[/caption]

    Very useful for comparing, for example, mentions of products, companies, people in blogs.



  1. Behind the Headlines NHS Choices http://www.nhs.uk/news/Pages/NewsIndex.aspx
    Behind the Headlines provides an unbiased and evidence-based analysis of health stories that make the news. It is a good source of information for confirming or debunking the health/medical claims made by general news reporting services, including the BBC. For each “headline” it summarises in plain English the story, where it came from and who did the research, what kind of research it was, the results, the researchers’ interpretation, conclusions and whether the headline’s claims are justified.
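
The site: tip in the list above lends itself to a small helper if you find yourself building the same kind of query again and again. This is a minimal Python sketch, not part of any search engine's toolkit: the function name and the example search terms are my own, and it simply assembles the text you would type into the search box.

```python
def with_site_filters(terms, include=None, exclude=()):
    """Append a site: restriction and any -site: exclusions to a query."""
    parts = [terms]
    if include:
        parts.append("site:" + include)
    parts.extend("-site:" + site for site in exclude)
    return " ".join(parts)


# Restrict a search to UK academic websites.
print(with_site_filters("nanotechnology", include="ac.uk"))

# Search inside one large site.
print(with_site_filters("energy tariffs", include="www.gov.uk"))

# Exclude individual sites from a search (hypothetical example sites).
print(with_site_filters("occupational asthma", exclude=["example.com", "example.org"]))
```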

Thursday 1 September 2016

Don't expect advanced search features to exist forever

A couple of weeks ago I wrote about the problems I was having with Google Verbatim (Google Verbatim on the way out?). This morning I ran through a checklist of commands that I am demonstrating in a webinar and it seems that Verbatim is back working as it should. Don't hold your breath, though. Three times this year I have seen Google Verbatim disappear or do strange things and a couple of weeks later return to normal. Verbatim may be here to stay or it may not, but you cannot depend on many advanced search commands to always work as you expect. So either learn different ways of making Google treat your search in the way you require or use a different search engine.

Unfortunately, disappearing or unreliable functionality is not confined to just Google. Bing used to have a very useful proximity command that allowed you to specify how close you wanted your words to be to one another. The "near:n"  operator is still listed in Bing's list of advanced search commands and, although it seems to do something and reduce the number of results, it does not behave as described.

There is also the endangered list, which includes DuckDuckGo's sort by date option. In fact all of DuckDuckGo's web search options will probably soon change or disappear as it is currently powered by Yahoo!, which has been bought by Verizon. Who will DuckDuckGo turn to if Verizon does combine Yahoo with AOL as has been stated in the press?

Get to know several different search tools really well and, for the ones that you use regularly, find out how they work and who provides the search results.

 

Thursday 11 August 2016

Google Verbatim on the way out?

Update: 1st September 2016 - Verbatim seems now to be working as it should. I hope it stays that way but on three occasions this year I have seen it work one day, then not the next and then back to working again.

We have become accustomed to Google rewriting and messing about with our searches, and dropping search features that are infrequently used. The one option that could control most of Google's creative interpretation of our queries was Verbatim but that now looks as though it could be destined for the axe as well.

A reminder of what Verbatim does. If you want to stop Google looking for variations on your terms, ignoring double quote marks around phrases, or dropping words from the search, Verbatim is, or rather was, the quickest way to do it. If you are using a desktop or a laptop computer, run your search as normal. On the results page click on 'Search Tools' at the end of the line of options that appears at the top. Then, from the second line of options that should appear, choose 'All results' followed by Verbatim. The location of Verbatim on other devices varies.

Verbatim has been invaluable when searching on titles of research papers, legislation or researching topics for which you expect or want to retrieve very few or zero results. You might be researching rare adverse events associated with a pharmaceutical drug or wanting to confirm that what you are about to patent has not  already been published and is out there for all to see. Or the topic is so specific that you only expect to see a  handful of documents, if that. So, sometimes, no or a low number of results is a good thing. But Google does not like zero or small numbers of results and that is when Google's search rewrite goes into overdrive.

I had noticed for a few months that Verbatim was not always working as expected but had hoped it was one of Google's experiments. The problem has not gone away and the really confusing part is that Verbatim is still doing something but not what I would expect.

I was working in Penryn in July and took the opportunity to wander around the place. Inevitably, I googled some of the sites I had seen for further information but one threw up the Verbatim problem. I was particularly interested in what looked like a memorial but didn't have time to seek out information on site. Looking at the photo afterwards I can see where the plaque was (to the right and next to the flagpole) but I missed it on the day.

[caption id="attachment_3714" align="aligncenter" width="300"]Memorial Garden Penryn The memorial and garden commemorate 18 residents of Penryn who were killed during an air raid in May 1941.[/caption]

I did see a sign on the wall surrounding the area, though, telling me that it was "two" on the Penryn Heritage Trail.

Penryn Heritage Trail

A quick, basic search told me that it is called the Memorial Garden but I wanted to find out more. I searched  on Penryn memorial garden heritage trail.

Google test search omitting terms 1

This gave me 15,900 results but Google had decided to leave out Penryn so I was seeing plenty of information about heritage trails but they were not all in Penryn. I prefixed Penryn with intext: to force Google to include it in the search but then the word heritage was dropped. I applied Verbatim to the search without the intext: command.

Verbatim-failure-2

This gave me 732 results but even though I had applied Verbatim Google had dropped 'memorial' from the search. I prefixed memorial with intext: and got 1230 results with little change to the top entries. And no, I have no idea why there are more hits for this more specific search. I can only assume that other terms were omitted but I was not seeing that in my top 50. I then did what I should have done right from the start and searched on Penryn, and "memorial garden" and "heritage trail" as phrases. When Verbatim was applied this came back with  22 results but no detailed information about the garden. I started to tweak the search terms a little more. Verbatim would drop one, I would 'intext:' them and they were then included but I began to suspect that I was being too specific. So I dropped "heritage trail" from the search and cleared Verbatim: 19,300 results with all of the top entries being relevant and informative.

Search on Penryn memorial garden

This emphasises that it often pays to keep your search simple, and I mean really simple. Including too many terms, however relevant you may think they are, can be counter-productive. I would have realised earlier that my strategy was too complex had Verbatim behaved as I assumed it would and it had included all of my terms with no variations or omissions.

I ran a few of my test searches to see if this is now a regular feature. One was:

prevalence occupational asthma diagnosis agriculture UK


The results came back as follows:

Ordinary search - prevalence missing from some of the documents, 1,750,000 results
Verbatim search - diagnosis and agriculture missing from some of the documents, 15,300 results
Verbatim with quote marks around missing terms - same results as plain Verbatim with diagnosis and agriculture still missing
Verbatim search but prefixing missing terms with intext:, 14,200 results


I changed the search slightly to:

incidence occupational asthma diagnosis agriculture UK


Some of the results were:

Ordinary search - incidence and agriculture missing from some of the documents, 2,210,000 results
Verbatim search - incidence and agriculture missing, 15,500 results
Ordinary search on intext:incidence occupational asthma diagnosis intext:agriculture UK, 848,000 results
Verbatim intext:incidence occupational asthma diagnosis intext:agriculture UK, 15,000 results


I saw the same pattern with a few other searches. I also tested the searches in incognito mode, and both signed in and signed out of my Google account. There was very little difference in the results and Verbatim behaved in the same way.

It looks as though Verbatim still runs your search without any variations on your terms or synonyms but that it now sometimes chooses to omit terms from some of the documents.  To keep those terms in the search you have to prefix them with intext:. Double quote marks around the words are sometimes ignored. This is an unnecessary change and defeats the object of having an option such as Verbatim.
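
Adding the intext: prefixes by hand soon becomes tedious when you are testing a series of searches. As a minimal sketch, the rewriting step can be done in a couple of lines of Python. The function name is my own and it only handles single words, not quoted phrases; it just rebuilds the text of the query and does not talk to Google.

```python
def force_terms(query, missing):
    """Prefix each word that Google dropped with intext: and rebuild the query."""
    dropped = {term.lower() for term in missing}
    return " ".join(
        "intext:" + word if word.lower() in dropped else word
        for word in query.split()
    )


# The second test search above, with the two terms that Verbatim omitted.
print(force_terms(
    "incidence occupational asthma diagnosis agriculture UK",
    ["incidence", "agriculture"],
))
```

The output is the same search string that I ended up typing in by hand for the last two tests in the second list.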

More worrying, though, is that Google obviously thinks Verbatim needs "fixing". But what it has done is to make the option more difficult to use, which in turn will result in people using it less often than they do already. And if Google sees that use is decreasing it will simply get rid of it altogether. Time to swot up on the few remaining Google commands, or use a different search tool.

If you are interested in learning more I am running workshops about Google and alternative search tools in September in London.

Friday 22 July 2016

Google's 'daterange:' command gone for good

It looks as though Google's daterange: command really has gone for good. Over the last 6 months it has been a case of "now it works, now it doesn't" but I've been testing it regularly over the past couple of months and it seems to have permanently stopped working. People have been reporting the problem in various forums since the start of this year.

So why bother using "daterange:" instead of the date/time option under Search tools? Because the latter does not work with Verbatim. It doesn't happen often but there are occasions when I need Google to search using my terms exactly as I have typed them without any omissions or variations AND limit the search to a specified time period. The only way to do that was to first run the search with the daterange included in the string and then apply Verbatim to the results.

It is getting to the point where Google is totally useless for advanced, focussed research. What will be next for the chop? filetype? site? If you haven't done so already, it is time to learn how to use the alternative search tools. Cue blatant plug for my September workshop with UKeiG: Essential non-Google search tools!

Wednesday 13 July 2016

Alternatives to Google: Carrot Search and eTools.ch

Two of the services I cover in my workshop for researchers on alternatives to Google are Carrot Search and eTools.ch, and recently one of the people who had attended the session in April asked me to confirm what Carrot Search used to provide its main results. Strictly speaking, neither Carrot Search nor eTools is Google-free: eTools is a metasearch tool that has Google as one of its sources and Carrot Search uses eTools for its web search. At the start of the year, Carrot Search offered 7 options for searching under tabs across the top of the search screen: Web, "wiki", Bing, News, Images, PubMed and Jobs. Web search used eTools.ch to provide the results.

[caption id="attachment_3688" align="aligncenter" width="500"]Carrot Search Carrot Search - beginning of 2016[/caption]

The range of options has now been reduced to just three: the more transparently labelled eTools Web Search, PubMed and Jobs.

[caption id="attachment_3689" align="aligncenter" width="477"]Carrot Search options July 2016 Carrot Search options July 2016[/caption]

 

This makes sense as the number of accesses to Bing via the API was always limited and I could never get the news or images options to work. eTools in any case is a metasearch engine covering 17 tools including Google, Bing and Wikipedia so the extra Carrot Search tabs did seem to be unnecessary. The full list can be seen on the eTools home page.

[caption id="attachment_3690" align="aligncenter" width="417"]eTools list of search engines eTools list of search engines[/caption]

This is where it gets interesting. It appears that Carrot Search does not just copy the results from a search on eTools.  I ran a search on Brexit in Carrot Search and compared the results from eTools Worldwide and eTools United Kingdom. All of the sets  were different so Carrot Search must be doing some additional analysis and processing.
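
If you want to put numbers on how different two sets of results are, a simple overlap comparison will do the job. The following is a minimal Python sketch: the URLs are hypothetical stand-ins for results copied from each tool, not the results I actually saw, and the function name is my own.

```python
def compare_results(first, second):
    """Summarise the overlap between two lists of result URLs."""
    set_a, set_b = set(first), set(second)
    combined = set_a | set_b
    shared = set_a & set_b
    return {
        "shared": sorted(shared),
        "only_first": sorted(set_a - set_b),
        "only_second": sorted(set_b - set_a),
        # Jaccard similarity: 1.0 means identical sets, 0.0 means no overlap.
        "similarity": len(shared) / len(combined) if combined else 1.0,
    }


# Hypothetical result lists for the same search run in two tools.
carrot_search = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
etools_uk = ["https://example.com/b", "https://example.com/c", "https://example.com/d"]

summary = compare_results(carrot_search, etools_uk)
print("Shared:", summary["shared"])
print("Only in the first set:", summary["only_first"])
print("Only in the second set:", summary["only_second"])
print("Similarity:", summary["similarity"])
```

This ignores the order in which results are ranked, so two tools that return the same pages in a different order would still count as identical.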

Carrot Search doesn't just list the results but also organises them into topics or Folders that are displayed on the left hand side of the screen. These can be a useful way of narrowing down your search.

Carrot Search Brexit results

Carrot Search offers two other ways of displaying results: Circles and Foam Tree.

Carrot Search Circles

Carrot Search Foam Tree - 13th July 2016

Both show the density of terms in the top 100 results and allow you to click on an area to add the term or phrase to the search.  In addition I am finding that the Foam Tree is an interesting way of monitoring changes in news coverage and social media discussions on a topic, product or company. Yesterday, when I ran the search on Brexit, there was an area representing Theresa May.  Today, that had been replaced with one for David Cameron. I assume that is because the news coverage has been concentrating on David Cameron's last day as Prime Minister and his last Prime Minister's Questions (PMQ) in Parliament . Later he goes to see the Queen to officially resign as Prime Minister. Tomorrow,  with Theresa May as our new Prime Minister and a new Cabinet, the Foam Tree could have a very different structure so I shall be looking at it periodically to see if and how it reflects changes in events.

As I mentioned earlier, eTools.ch, which is behind the main Carrot Search web search, is a metasearch engine covering 17 tools. It also has options to select a country from a drop down list (Worldwide, Switzerland, Liechtenstein, Germany, Austria, France, Italy, Spain, UK) and a language (All, English, German, French, Italian, Spanish). Either or both of these give you completely different views and opinions on a subject.

eTools - Switzerland, all languages

eTools - Switzerland, French

eTools - Spain, all languages

It is a convenient way of gathering a range of foreign language information, especially on European events, and is easier than searching individual country versions of Google or Bing. The disadvantages are that the range of countries and languages is limited and many of the articles will not be in English. Nevertheless, I often find it helpful at the start of a piece of research as I get a general feel for the type and range of information that is available.

Carrot Search and eTools.ch are just two of the tools that I cover in my workshop on alternatives to Google. If you are interested in finding out more, the next session is being organised by UKeiG and will be held in London on Wednesday, 7th September 2016. Further details are available on the UKeiG website.

Thursday 2 June 2016

Searching for the height of Ben Nevis - how hard can it be?

If you have attended one of my recent search workshops, or glanced through the slides, you will have noticed that I have a new test query: the height of Ben Nevis. It didn't start out as a test search but as a genuine query from me.  A straightforward search, I thought, even for Google.

I typed in the query 'height of ben nevis' and across the top of the screen Google emblazoned the answer: 1345 metres.  That sort of rang a bell and sounded about right, but as with many of Google's Quick Answers there was no source and I do like to double or even triple check anything that Google comes up with.

Ben_Nevis_1

To the right of the screen was a Google Knowledge Graph with an extract from Wikipedia telling me that Ben Nevis stands at not 1345 but 1346 metres above sea level. Additional information below that says the mountain has an elevation of 1345 metres and a prominence of 1344 metres (no sources given). I now have three different heights - and what is 'prominence'?

Ben-Nevis-3

After a little more research I discovered that prominence is not the same as elevation, but I shall leave  you to investigate that for yourselves if you are interested. The main issue for me was that Google was giving me at least three slightly different answers for the height of Ben Nevis, so it was time to read some of the results in full.

Before I got around to clicking on the first of the two articles at the top of the results, alarm bells started ringing.  One of the metres to feet conversions in the snippets did not look right.

Height of Ben Nevis search results 3

So I ran my own conversions for both sets of metres to feet and in the other direction (feet to metres):

1344m = 4409.449ft, rounded down to 4409ft

4406ft = 1342.949m, rounded up to 1343m

1346m = 4416.01ft, rounded down to 4416ft

4414ft = 1345.387m, rounded down to 1345m
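These conversions are easy to re-run for yourself. Here is a minimal sketch in Python; the function names are my own, and the only fact it relies on is that the international foot is defined as exactly 0.3048 metres.

```python
METRES_PER_FOOT = 0.3048  # the international foot is exactly 0.3048 m


def metres_to_feet(metres):
    """Convert metres to feet without any intermediate rounding."""
    return metres / METRES_PER_FOOT


def feet_to_metres(feet):
    """Convert feet to metres without any intermediate rounding."""
    return feet * METRES_PER_FOOT


# The figures that appeared in the snippets of the top two results.
for metres in (1344, 1346):
    print(f"{metres}m = {metres_to_feet(metres):.3f}ft")

for feet in (4406, 4414):
    print(f"{feet}ft = {feet_to_metres(feet):.3f}m")
```

Printing three decimal places before deciding how to round makes it obvious when a quoted pair of figures cannot both have come from the same measurement.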

As if finding three different heights was not bad enough, it seems that the contributors to the top two articles are incapable of carrying out simple ft/m conversions, but I suspect that rounding the figures up or down before the calculations were carried out is the cause of the discrepancies.

The above results came from a search on Google.co.uk. Google.com gave me similar results but with a Quick Answer in feet, not metres.

Ben-Nevis-4

We still do not have a reliable answer regarding the height of Ben Nevis.

Three articles below the top two results were from BBC News, The Guardian and Ordnance Survey - the most relevant and authoritative for this query -  and were about the height of Ben Nevis having been remeasured earlier this year using GPS. The height on the existing Ordnance Survey maps had been given as 1344m but the more accurate GPS measurements came out at 1344.527m or 4411ft 2in. The original Ordnance Survey article explains that this is only a few centimetres different from the earlier 1949 assessment but it means that the final number has had to be rounded up rather than down. The official height on OS maps has therefore been increased from 1344m to 1345m.  So Google's Quick Answer at the top of the results page was indeed correct.

Why make a fuss about what are, after all, relatively small variations in the figures? Because there is one official height for the mountain and one of the three figures that Google was giving me (1346m) was neither the current nor the previous height. Looking at the commentary behind the Wikipedia article, which gave 1346m, it seems that the contributors were trying to reconcile the height in metres with the height in feet but carrying out the conversion using rounded up or rounded down figures. As one of my science teachers taught me long ago, you should always carry forward to the next stage of your calculations as many figures after the decimal point as possible. Only when you get to the end do you round up or down, if it is appropriate to do so. And imagine if your Pub Quiz team lost the local championship because you had correctly answered 1345m  to this question but the MC  had 1346m down as the correct figure? There'd be a riot if not all out war!
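My science teacher's rule can be shown with the GPS figure itself. This is just an illustrative sketch with variable names of my own choosing: it converts 1344.527m to feet once using the full figure, and once after first rounding to whole metres.

```python
METRES_PER_FOOT = 0.3048  # the international foot is exactly 0.3048 m

gps_height_m = 1344.527  # GPS measurement reported by Ordnance Survey

# Round only at the end: convert the full-precision figure, then round.
rounded_late = round(gps_height_m / METRES_PER_FOOT)

# Round too early: round to whole metres first, then convert and round again.
rounded_early = round(round(gps_height_m) / METRES_PER_FOOT)

print(rounded_late, rounded_early)  # prints: 4411 4413
```

Two feet of mountain appear out of nowhere simply because the rounding was done one step too soon.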

That's what Google gave us. How did Bing fare?

The US and UK versions of Bing gave results that looked very similar to Google's but  with two different quick answers in feet, and neither gave sources:

Bing UK

Ben-Nevis-Bing-UK

Bing US

Bing-Ben-Nevis-US

I won't bore you with all of the other search tools that I tried except for Wolfram Alpha. This gave me 1343 meters or 4406 ft. At least the conversion is correct but there is no direct information on where the data has been taken from.

Ben-Nevis-WA

The sources link was of no help whatsoever and referred me to the home pages of the sites and not the Ben Nevis specific data. On some of the sites, when I did find the Ben Nevis pages, the figures were different from those shown by Wolfram Alpha so I have no idea how Wolfram arrived at 1343 meters.

So, the answer to my question "How high is Ben Nevis?" is 1344.527m rounded up on OS maps to 1345m.

And the main lessons from this exercise are:

  1. Never trust the quick answers or knowledge graphs from any of the search engines, especially if no source is given. But you knew that anyway, didn't you?

  2. If you are seeing even small variations in the figures, and there are calculations or conversions involved, double check them yourself.

  3. Don't skim read the results and use information highlighted in the snippets - read the full articles and from more than one source.

  4. Make sure that the articles you use are not just copying what others have said.

  5. Try and find the most relevant and authoritative source for your query, and ideally a primary source. In this case it was Ordnance Survey. GB officially taller - Ben Nevis  https://www.ordnancesurvey.co.uk/about/news/2016/gb-officially-taller-ben-nevis.html

Friday 27 May 2016

Small companies now allowed to be bigger ... or smaller

One of the services I provide is company research including official registry documents and accounts. Many registries, including the UK's Companies House, make a significant amount of their data available free of charge. Some still charge for documents and a few insist that you register before you can even search for a company. If I know the information is freely available I usually point the client at the relevant website but a few people come back to me when they discover that the interface is in a foreign language. If registration and/or payment are required I'm often asked to search on the client's behalf because they just do not want the hassle of going through the registration process and recouping the cost of a small overseas transaction from their accounts department.

Regardless of whether the information is free or charged for, I often receive what I call a second stage request for more detailed accounts. Why is there no Profit & Loss? Where is the revenue/turnover figure? I then have to explain how the reporting and filing requirements differ depending on the country, or even state; and then I have the joy of taking the client through small company exemptions. Some people I know have only just got their head around the changes introduced by UK Companies Act 2006. I now have to tell them that this has changed yet again.

In March 2015 the UK Government approved new regulations that implement the requirements of the new EU Accounting Directive. The changes came into effect in the UK from 1 January 2016. There are a number of changes, which may reduce yet further the amount of information that small companies are required to provide, and there are also changes to what is deemed to be a "small" company. Small companies can now be bigger.

A company can now qualify as small if it meets at least two of the following three criteria:

  • turnover not more than £10.2m (previously £6.5m)

  • balance sheet total not more than £5.1m (previously £3.26m)

  • average number of employees not more than 50 (no change)
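The "two out of three" test is easy to get wrong in your head, so here it is as a short sketch. The function name and parameters are my own, and it checks only the three thresholds listed above; it deliberately ignores any other eligibility conditions in the regulations, so treat it as an illustration and not as accounting advice.

```python
def qualifies_as_small(turnover_gbp, balance_sheet_total_gbp, average_employees):
    """Return True if at least two of the three size criteria are met.

    Simplified illustration: only the three thresholds quoted above are checked.
    """
    criteria_met = [
        turnover_gbp <= 10_200_000,            # turnover not more than £10.2m
        balance_sheet_total_gbp <= 5_100_000,  # balance sheet total not more than £5.1m
        average_employees <= 50,               # average employees not more than 50
    ]
    return sum(criteria_met) >= 2


# Turnover is over the limit, but the other two criteria are met.
print(qualifies_as_small(12_000_000, 4_000_000, 40))  # prints: True
```

Under the previous thresholds (£6.5m and £3.26m) the same hypothetical company would have met only the employee criterion, which is exactly how "small" has become bigger.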


Information on some of the other changes can be found on the Companies House Blog - Changes to accounting standards and regulations. The key ones are:

"... the removal of the ability for a small or medium-sized company to file abbreviated accounts with us at Companies House. A company will now be required to file the accounts they prepare for their members at Companies House (although a small company or micro-entity will usually be able to choose not to file their profit and loss account or director's report)."

"However, this does not mean that all small companies are now required to file full accounts, the very smallest companies may disclose less information by preparing micro-entity accounts. Other small companies may, instead of filing full accounts, choose to prepare a set of abridged accounts for their members and then file these with us."

So, as well as "small" being allowed to be bigger we now have even smaller companies or "micro-entities" who can choose to disclose less information. The whole thing is beginning to look as clear as mud!

The ICAEW has a useful overview of what is happening at The revised UK small companies regime but if you want to keep up with the latest updates then follow the Companies House Blog.

Thursday 28 April 2016

Bing extends date search option

Bing has at last extended its date search options. Until recently one could only limit results to the past 24 hours, past week or the past month, and then only in Bing US.  Bing has now added a custom range on a par with Google.

Bing_Date_US_2

The UK version of Bing has not had a date option until now but bizarrely has added the old, limited US selection.

Bing-Date-UK-2

It seems very strange that they haven't implemented the full US list. One can but hope that it will happen soon rather than in several years' time, which is how long it has taken for this version to appear in Bing UK.

Advanced Google workshop - Top Tips

This collection of Top Tips is a combined list nominated by those who attended the UKeiG workshop on “New Google, New Challenges”. The next UKeiG Google workshop will be run on 8th September 2016.

1. Do not trust Google’s facts and answers
Google tries to provide facts and quick answers to your queries at the top and to the right of your results. These are computer generated extracts from pages and several different sources may be used to produce an “answer”. They are sometimes misleading or completely wrong. At the time of writing, the answer provided for a search on frugivore is an excellent example. (It explains why your cat is so fussy over its food – it is obviously craving its 5 a Day!) Always go to the original source to double check the information, but this is not always provided by Google.

2. Country versions of Google and /ncr
Country versions of Google give priority to the local content. This is a useful strategy when searching for research groups, companies and people that are active or working in a particular country. Use the standard ISO two letter country code, for example http://www.google.fr/ for Google France, http://www.google.it/ for Google Italy.

It is also worth trying your search in Google.com. Your results will probably be more international or US focused but you may see new search features or layouts in Google.com that are not yet available elsewhere. If Google insists on redirecting you to your own country version, go to the bottom right hand corner of the Google home page and you should see a link to Google.com. If there is no link then add ‘/ncr’ to the Google URL, for example http://www.google.com/ncr .

The downside of using country versions of any search tool is that the prioritised information is likely to be in the local language.

3. Search history
Your search history, which is recorded and available for you to view if you are signed in to your Google account, is used by Google to help personalise your results but it can also be useful as a record of past searches. If a user comes back to you having forgotten or lost the search and documents you gave them, your search history should be able to help you find both. On any search results page click on the cog wheel in the upper right hand area of the screen and select History. You can then browse your history or select a date from the calendar (upper right hand area of the History screen).

4. Verbatim
This is an essential tool for making Google carry out your search the way you want it run. Google automatically looks for variations on your terms and sometimes drops terms from your search, which is not always helpful. To use Verbatim, first run your search. Then click on ‘Search tools’ in the menu that runs across the top of your results page. A second row of options should appear. Click on ‘All results’ and from the drop down menu select Verbatim. Google will then search for your terms without any variations or omissions. Note that Google will search for documents and pages in which the words appear in any order. If you are searching on the title of a paper place the title within double quote marks to force an exact phrase match. If Google still alters your search then run Verbatim. 

Verbatim-Factsheet
If you are carrying out in-depth research it is worth trying out Verbatim even if the “normal” Google results seem OK. You may see very different and possibly more relevant content.

5. filetype: command.
An important advanced search command that is available not only in Google but in many alternative search tools. Use the filetype: command to limit your research to PowerPoint for presentations, spreadsheets for data and statistics, or PDF for research papers and industry/government reports.

For example:

plasmonic nanoparticles filetype:ppt

The command must be all lower case and there must be no spaces between the colon and the command or the file extension, otherwise Google will treat the command as a searchable word. Also you must search for pre and post Office 2007 file extensions separately as Google does not automatically pick up both.

For example:

plasmonic nanoparticles filetype:ppt OR filetype:pptx

Note that Google’s Advanced Search screen pull down menu for filetype: only searches for pre Office 2007 extensions.
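If you build search strings in a script or a spreadsheet macro, the pairing of old and new extensions can be automated. The sketch below is purely illustrative: the function name and the table of extension pairs are my own, and it only assembles the text of the query; it does not talk to Google.

```python
# Pre and post Office 2007 extensions that need to be searched together.
EXTENSION_PAIRS = [("ppt", "pptx"), ("xls", "xlsx"), ("doc", "docx")]


def filetype_clause(extension):
    """Return a filetype: clause covering both old and new Office extensions."""
    ext = extension.strip().lower().lstrip(".")
    variants = [ext]
    for pair in EXTENSION_PAIRS:
        if ext in pair:
            variants = list(pair)
    # Lower case command, no space either side of the colon, OR in capitals.
    return " OR ".join(f"filetype:{v}" for v in variants)


print("plasmonic nanoparticles " + filetype_clause("ppt"))
# prints: plasmonic nanoparticles filetype:ppt OR filetype:pptx
```

Formats without an Office 2007 counterpart, such as pdf, are passed through unchanged.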

6. Minus sign to exclude information
Use the minus sign immediately before a term to exclude documents containing that term, but use with care as you may lose valuable information. It can also be used with commands to exclude file formats or websites from your search.

For example:

occupational asthma UK site:gov.uk -site:hse.gov.uk -site:nationalarchives.gov.uk


7. Combine search commands
Combine multiple commands such as filetype: and site: to focus your search. Use the OR command to search for alternatives, for example:

occupational asthma UK site:ac.uk filetype:ppt OR filetype:pptx
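The two example searches in tips 6 and 7 follow the same pattern, which can be sketched as a small helper. The function `build_query` and its parameter names are hypothetical, invented for this illustration; it simply joins the pieces into a search string that you can paste into the search box.

```python
def build_query(terms, site=None, exclude_sites=(), filetypes=()):
    """Assemble a search string from terms, site: limits and filetype: options."""
    parts = [terms]
    if site:
        parts.append(f"site:{site}")
    # A minus sign immediately before site: excludes that site.
    parts.extend(f"-site:{s}" for s in exclude_sites)
    if filetypes:
        parts.append(" OR ".join(f"filetype:{ft}" for ft in filetypes))
    return " ".join(parts)


print(build_query("occupational asthma UK", site="gov.uk",
                  exclude_sites=["hse.gov.uk", "nationalarchives.gov.uk"]))
print(build_query("occupational asthma UK", site="ac.uk",
                  filetypes=["ppt", "pptx"]))
```

Keeping the pieces separate makes it easier to rerun the same search with a different site or file format without retyping, and mistyping, the commands.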

8. Personalise Google News
When signed in to your account, personalise your Google News page (http://news.google.co.uk) to change what content is automatically displayed or to add your own searches. Click on the Personalise button at the top of the right hand column.

9. Google Scholar Cite feature
Click on the Cite link under a reference in Google Scholar and Google will give you options to import a citation in MLA, APA, Chicago, Harvard or Vancouver style into BibTeX, EndNote, RefMan or RefWorks. Note that if the article is only available online you may need to add a DOI or a URL, and the date of access.

10. Use Google site: search on Google Scholar
This is one I had not thought of but it was recommended by one of the delegates as a way of using Google’s advanced search commands on Google Scholar instead of Scholar's own. (I have not had time to test this one out myself.)

Essential non-Google Search Tools - Top Tips

It has been a while since I did a Top Tips from my workshops so here is the first of two that came out of a couple of recent UKeiG events.  This collection of Top Tips is a combined list nominated by those who attended the workshop on “Essential non-Google Search Tools” on 12th April 2016 in London.

This particular workshop will be re-run later in the year on September 7th. See the UKeiG training pages for further details.

1. Use more than one search tool
Different search tools have different coverage and search features, and sort results differently. If you are doing in-depth research use more than one to make sure you are covering all aspects, and use the tool that is most appropriate for the type of information you require.

2. filetype: command
An important advanced search command that is available not only in Google but in many alternative search tools. Use the ‘filetype:’ command to limit your research to PowerPoint for presentations, spreadsheets for data and statistics, or PDF for research papers and industry/government reports.

For example:

home ownership UK filetype:xls

Make sure that filetype is all lower case and that there are no spaces before or after the colon.

Unlike Google, most of the alternative general search engines will automatically search for both the pre Office 2007 file extensions (xls, ppt, doc) as well as the current ones (xlsx, pptx, docx) regardless of whichever version you specify.

3. Behind the Headlines - NHS Choices
http://www.nhs.uk/news/Pages/NewsIndex.aspx
This is an excellent site for tracking down the truth and the research behind sensational, front page stories about medical breakthroughs. It explains in plain English what the background is behind the story and whether or not the claims made by the newspaper articles are valid.

Behind_the_headines_2

4. Million Short http://millionshort.com/
If you are fed up with seeing the same results again and again give Million Short a try. Million Short enables you to remove the most popular websites from the results. Originally, as its name suggests, it removed the top 1 million but you can now choose to remove the top 100, 1000, 10K, 100K, or million from your search. The page that best answers your question might be on a site that is not well optimised for search engines, or might cover a topic that is so specialised that it never makes it into the top results in Google or Bing.

There are filters to the left of the results enabling you to remove or restrict your results to ecommerce sites, sites with or without advertising, live chat sites and location. The sites that have been excluded are listed to the right of the results and you can, if you wish, view the excluded pages by site.

5. Carrot Search http://search.carrotsearch.com/carrot2-webapp/search
Carrot Search was nominated for the Top Tips for its clustering of results into topics (left hand side of the results screen) that enable you to filter and focus the search, as well as the visualisations of terms and concepts via the circles and “foam tree”. This is always a popular search tool with those who prefer visualisations rather than just text as a way of presenting and refining results. Click on the Circles and Foam Tree tabs at the top and to the left of the results.

6. Compound Interest http://www.compoundchem.com/
“Compound Interest is a site that aims to take a closer look at the chemical compounds we come across on a day-to-day basis. It also provides graphics for educational purposes, both for teacher and student use.” It is run by Andy Brunning, a chemistry teacher based in Cambridge.

Recent topics include:

The Chemistry of Camembert http://www.compoundchem.com/2016/02/10/the-chemistry-of-camembert/
Chemistry History: Teflon & Non-Stick Pans  http://www.compoundchem.com/2016/02/04/teflon/
The Chemistry of an Electric Guitar http://www.compoundchem.com/2015/11/24/guitar/


7. Internet Archive: Wayback Machine http://archive.org/
Want to see what was on a website a few years ago or trying to track down a document that seems to have vanished from the web? Try the Internet Archive. Enter the URL of the website or document and you should then see a calendar of the snapshots that are in the archive. Choose a date from the calendar to view the page. The archive does not have everything but it is worth a try. See also the UK National Archives of old government websites and pages at http://www.nationalarchives.gov.uk/webarchive
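If you need to check a lot of URLs, the addresses for the Wayback Machine can be put together in a few lines. This is a sketch under stated assumptions: the function names are my own, and the two address patterns (web.archive.org/web/ followed by a timestamp and the URL, and the archive.org/wayback/available lookup) are as I understand the Internet Archive to offer them, so do check their current documentation. The code only builds the addresses; it does not fetch anything.

```python
from urllib.parse import urlencode


def wayback_snapshot_url(page_url, timestamp):
    """Build the address of the snapshot nearest to a timestamp (YYYYMMDD)."""
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


def wayback_availability_url(page_url, timestamp=None):
    """Build a lookup address asking whether a snapshot of the page exists."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


print(wayback_snapshot_url("http://www.parliament.uk/", "20160101"))
print(wayback_availability_url("http://www.parliament.uk/", "20160101"))
```

Paste the first address into a browser to go to the snapshot closest to that date, or use the second in a script to find out whether anything has been archived at all.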

8. UK Parliament http://www.parliament.uk/
Perfect for monitoring the progress of legislation through Parliament (http://www.parliament.uk/business/bills-and-legislation/). As well as following the progress of legislation you can view the documents associated with a Bill (explanatory notes, amendment papers, report stage procedures, select committee reports etc). RSS and email alerts are available for each Bill.

9. Tineye http://www.tineye.com/
Reverse image search tool for seeing where and when an image has been used. Either upload an image or enter an image URL. Sort the results by Best match (default), Most changed, Biggest image, Newest or Oldest. Browser plugins are available for Firefox, Chrome, Safari, IE and Opera.

10. Search for images by license
If you want to be sure that you are allowed to use an image for a project use a search tool that enables you to search by license. Bing has a license filter in its image search so that you need only view those that have the appropriate license. Run your search and use the drop down menu under License in the menu bar across the top of the results to apply a copyright filter.

Always go to the page hosting the image to check that the license does apply to the image you want and not to another one on the same page. (Google Images has a similar option).

Flickr Creative Commons (http://www.flickr.com/creativecommons) describes what the different licenses allow you to do and enables you to search for photos with that license.

Other tools that have Creative Commons or public domain images include:

Wikimedia Commons http://commons.wikimedia.org/  (but do check the full information on each image as there may be copyright restrictions under some jurisdictions)
MorgueFile.com  http://www.morguefile.com/
Geograph http://www.geograph.org.uk/  “UK and Ireland photos of landmarks and buildings for every Ordnance Survey grid”
Nasa http://www.nasa.gov/

Saturday 2 April 2016

Flickr no longer allows easy deletion of automatic tags

UPDATE: Flickr have now restored the option to delete their automatically generated tags

Flickr no longer allows users to easily remove the automatically generated tags that it adds to photos. Flickr has been using image recognition technology for a couple of years to automatically generate tags for users’ photos but didn't make them visible until May 2015.  As well as new photos, the computer generated tags had been added retrospectively to all previously uploaded photos. My own experience is that many  of the tags are useless and some are totally wrong. See my earlier posting Flickr pulls out all the stops with automatic tagging.

Flickr_Star_Anise_Tags

User generated tags are in a grey box and Flickr's automatic tags are in a white or light grey box. As the tags are used by Flickr when searching for images it is important that they are correct, and it explains why Flickr search results often contain irrelevant images.

Until now, both users' and Flickr's tags could be deleted. Hover over a tag and a cross would appear in the upper right hand corner enabling you to delete that tag. The cross no longer appears on Flickr generated tags so they cannot be deleted that way. There is a work around which is to manually add a tag that is identical to the one you want to remove and then delete the tag you have just added. This also deletes the corresponding Flickr tag.

Several people have commented that there is an option under Settings, Privacy and Permissions that enables you to hide auto tags. This does exactly what it says on the tin: it "hides" the tags. It does not remove them so they will still be used by Flickr's search.

Wednesday 30 March 2016

Debunking Euromyths

Those of us living in the UK have become accustomed to sensational headlines in the British press warning us that the European Union (EU) is about to ban British cucumbers, sausages, cheese, church bells, street acrobats [insert food or activity of your choice]. Tracking down the relevant EU legislation to find out whether or not there is any truth in the stories is a nightmare, and they are not the easiest of documents to read and understand when you do find them. But help is at hand from an EU blog called "European Commission in the UK – Euromyths and Letters to the Editor" at http://blogs.ec.europa.eu/ECintheUK/.

The blog covers scare stories that have appeared in the UK press, some of which go back to 1992, and explains what the situation really is and the relevant legislation.

Euromyths A-Z

There is a neat A-Z index at   http://blogs.ec.europa.eu/ECintheUK/euromyths-a-z-index/ so you can quickly check, for example, if the EU is about to ban bagpipes:

"As for banning bagpipes, Scots can rest assured that their favourite musical instrument is not under threat from EU proposals on noise pollution ... they are designed primarily for those who work with loud machinery for a sustained period – more than 87 decibels for eight hours in a row. The law ... will apply only to workers rather than audiences.  If, in the highly unlikely event a bagpipe player is hired to play continuously for eight hours, and the noise created averaged more than 87 decibels, the employer would be obliged to carry out a risk assessment to see where changes can be made – tinkering with the acoustics in a hall to reduce echoes, for example. If that fails, personal protection such as earmuffs will need to be considered, but only as a last resort. Banning musical instruments is not an option. "


The blog is just one of many on the Europa website. A list can be found at Blogs of the European Commission.

Friday 11 March 2016

Flickr pulls out all the stops with automatic tagging

Flickr really went to town with its automatically generated tags for my photo today. The photo was of star anise, which I took for the Challenge Friday Group; this week's challenge is stars. A straightforward, simple photo of the spice, I thought, but Flickr read a lot more into it.

This is the photo:

Star_Anise_20160311_Signed

And these are the tags:

Flickr_Star_Anise_Tags

The ones at the top with the grey background are the tags that I assigned to the photo. The rest are Flickr's.

I deleted several of them before I thought of taking a screenshot for posterity but what is left gives  you a flavour of the range of concepts that Flickr feels are relevant.  Some of them I agree with (pattern, star shape, symmetry). Others I shall delete  as they are of no help to anyone searching on them (foliage, leaf, landscape, tree, forest, blossom,  pastel).

I do, though, rather like "minimalism" even though the structure and complexity of flavour and aroma  of star anise is far from minimalist. It  stays.

Thursday 10 March 2016

Business information key resources and search strategies - Top 10

The participants  of the business information workshop I ran on March 8th  had a variety of interests: search strategies and commands for Google et al,  UK government information, statistics, open data, social media, companies, locating scientific research.  So it was quite tough limiting the Top Tips that I asked them to nominate at the end of the day to just 10.

This is what they came up with.
  1. Get to know the key resources and starting points for different types of business information e.g. Companies House, OFFSTATS and go direct to those rather than Google. It will save you time in the long run.

  2.  Verbatim. An invaluable tool for research when Google insists on rewriting your search and dropping terms. To make Google search for all of your terms without variation, but in any order, first run your search. Then click on ‘Search tools’ in the line of options above your results. In the second line of options that appears click on ‘All results’ and from the drop down menu select Verbatim. If you are carrying out in-depth research it is worth using Verbatim even if your “normal” Google results seem to be OK. You may see very different content in the Verbatim list.

  3. Combine advanced search commands such as site: and filetype: to focus your search on types of information (PDF reports, PPT presentations, spreadsheets containing data) and websites (government, academic, individual sites). Also try using the minus sign to exclude documents containing specific terms or sites that are irrelevant.

  4. Phil Bradley’s UK Newspapers Google Custom Search Engine. http://www.philb.com/nationaluknewspapers.html
    A relatively new tool that enables you to search all of the major national UK newspapers and regional newspapers. A real time saver if you are searching for local information on a local business or entrepreneur and don’t want to have to track down all the local papers and search them one by one.

  5. OFFSTATS - The University of Auckland Library http://www.offstats.auckland.ac.nz/ A good starting point for official statistical sources by country, region, subject or combination of categories. All of the content in the database has been chosen and quality assessed by staff at The University of Auckland Library.

  6. Zanran http://zanran.com/ A tool for searching information contained in charts, graphs and tables of data. Enter your search terms and optionally limit your search by date and/or format type. Zanran comes up with a list of documents that match your criteria with thumbnails to the left of each entry. Hover over the thumbnail to see a preview of the page containing your data and further information on the document.

  7. Advanced Twitter Search. http://twitter.com/search-advanced Essential tool if you are using Twitter to look for news on product developments, announcements, conferences, discussions on technologies/companies, or how companies interact with customers.

  8. Wayback Machine http://www.archive.org/ Want to see what was on a website a few years ago or trying to track down a document that seems to have vanished from the web? Try the Internet Archive Wayback Machine at http://www.archive.org/. Enter the URL of the website or document and you should then see a calendar of the snapshots that the archive has. Choose a date from the calendar to view the page. The archive does not have everything but it is worth a try. See also the UK National Archives of old government websites and pages at http://www.nationalarchives.gov.uk/webarchive/

  9. OUsefulInfo, http://blog.ouseful.info/ “Trying to find useful things to do with emerging technologies in open education and data journalism”. Maintained by Tony Hirst, this blog has useful information and descriptions of what can be involved when dealing with and manipulating open data.

  10. DuckDuckGo  http://duckduckgo.com/ This was not covered in the workshop but one of the participants recommended it as a useful alternative to Google. Aside from the absence of tracking and personalisation, it provides different results, and a greater variety of them, when compared with Google.

Edited highlights of the workshop slides can be found on authorSTREAM and Slideshare.
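
For anyone who builds searches in a script rather than typing them in, tip 3 in the list above can be sketched in a few lines of Python. This is only an illustration: the function name build_query and its parameters are my own invention, not part of any Google tool, and the example terms and site are made up.

```python
def build_query(terms, site=None, filetype=None, exclude=()):
    """Assemble a search string from plain terms plus optional commands.

    Hypothetical helper for illustration only; it just joins text together.
    """
    parts = list(terms)
    if site:
        # Command in lower case with no space after the colon.
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    # A minus sign directly before a term excludes documents containing it.
    parts.extend(f"-{term}" for term in exclude)
    return " ".join(parts)


query = build_query(
    ["renewable", "energy", "statistics"],
    site="gov.uk",
    filetype="pdf",
    exclude=["wind"],
)
print(query)  # renewable energy statistics site:gov.uk filetype:pdf -wind
```

The resulting string can be pasted straight into the search box. Nothing here checks that the search engine will actually honour every term, so the results still need to be checked.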

My next business information related workshop is Discover Open Data on the 7th April. The next advanced Google workshop (New Google, New Challenges) is on the 13th April and the Essential non-Google search tools workshop is on the 12th April.

Tuesday 9 February 2016

Google advanced search - get it right!

When running advanced search workshops, and especially Google sessions, I prefer not to dwell on commands and search options that are no longer supported. They are gone and that is that, and it is far better to concentrate on how to get the best out of what is left. Of course it is unavoidable when your slides have been prepared several days before the event and  Google decides to pull the plug on one of your favourite search features just before you start! Similarly I tend not to show "this is how NOT to use...." a command  or incorrect syntax. It is often the incorrect format that one remembers.  Recently, though, I have added slides  to my presentations that cover both defunct commands and errors in syntax and format.

The problem is that not only are many people unaware that some search options are no longer available but also some fact sheets and articles covering advanced search are getting it wrong. The recent Guardian article on top search tips for Google almost got it right  but referred to the tilde, which was dropped in 2013, and did not really understand how Google automatically looks for synonyms and variations on a term (see my earlier blog posting Guardian’s top search tips for Google not quite tiptop). I have also seen a couple of recently produced Google  fact sheets riddled with mistakes.

The wonderful thing about Google is that it can take the most tortuous and error ridden search string and still come back with something that is sensible - most of the time.  The downside of this is that one assumes the search query has worked as intended when in fact Google has totally rewritten the search for you.  At some point, though, Google will rewrite the search in such a way that it brings back rubbish. So, it is important to know what commands are available and how they should be used.

Let's get started.

Plus (+) sign before a word to force an exact match.

This was discontinued in October 2011 because Google intended to use it as a way to search for Google+ pages. That has been abandoned and it is now a searchable character.  If you want to force an exact match search on a term precede the term with intext: for example intext:agriculture.

I have also seen examples claiming that a plus sign between words acts as a Boolean AND. No, it doesn't.  If you do get different results when using + it is because Google is searching for that as well as your terms.

Tilde  (~) for synonyms

This was withdrawn in  June 2013  because not many people used it and it was no longer needed. Google now looks for synonyms by default.

thesaurus: for alternative terms

'thesaurus:' sort of works because Google treats 'thesaurus', having ignored the colon, as a search term. So 'thesaurus:eclectic' will give you links to pages and websites of dictionaries and definitions that give synonyms for eclectic. It does not give you a straightforward list of alternatives in the same way that 'define' does. If you use thesaurus you have to go to the websites in turn to view the synonyms.

The asterisk *

The asterisk (*) is a placeholder for terms between two words e.g. solar * panels finds solar photovoltaic panels, solar PV panels, solar thermal panels. It is NOT a truncation symbol. Again, you might think it is because Google ignores the asterisk and automatically looks for  words that begin with the letters you have typed in.

The example I gave in my earlier blog posting was a search on phenobarb*. I expected Google to pick up references to phenobarbitone. It picked up 76,000 results including phenobarbital but there was no mention of phenobarbitone in the first 100.  Phenobarb without the asterisk picked up the exact same results.  A search on phenobarbitone, with and without the asterisk came up with 241,000 results. I have no idea how or when Google decides to stop looking for variations on your string but it is obvious from the above example that the asterisk is not a truncation symbol.

Do NOT capitalise the first letter of commands, and NO spaces

Commands such as intitle:, intext:, filetype: and site: must be all lower case and NO spaces between the colon and the search term. Capitalise the first letter or add a space after the colon or both and Google treats the command as an ordinary searchable word.

The correct format for an intitle: search is, for example, intitle:caversham and finds the following:



Capitalise the first letter of the command or insert a space or both and you find:



I do understand why so many fact sheets, and presentations, show commands with an initial capital letter. You spend ages preparing your information and when you have sent off your slides for printing or converted your document to a PDF you discover that Microsoft Office has changed the format of the command. Because your search example is on a separate line with the command at the start Office, bless it, decides to auto-correct and capitalise the first letter. I know, it has happened to me! So, please, check and double check your support materials.
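
If you keep your search examples in a text file, a small script can do some of that double checking for you. The following is a minimal sketch in Python, assuming you only care about the four commands mentioned above; the function name check_commands and the wording of the messages are hypothetical, and it only spots a capitalised command or a space after the colon.

```python
import re

# The commands discussed in this post; extend the tuple as needed.
COMMANDS = ("intitle", "intext", "filetype", "site")

# A command name in any letter case, the colon, then any spaces that follow it.
_PATTERN = re.compile(r"\b(" + "|".join(COMMANDS) + r"):(\s*)", re.IGNORECASE)


def check_commands(query):
    """Return a list of formatting problems found in a search string."""
    problems = []
    for match in _PATTERN.finditer(query):
        name, gap = match.group(1), match.group(2)
        if name != name.lower():
            problems.append(f"'{name}:' should be all lower case")
        if gap:
            problems.append(f"remove the space after '{name}:'")
    return problems


print(check_commands("intitle:caversham"))    # []
print(check_commands("Intitle: caversham"))   # two problems reported
```

An empty list means that the example passed these two checks, not that the search itself will behave as intended.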

Google searches for all of your terms by default

Not always. If your search, as it stands, finds zero or a low number of results then Google will drop one or more terms that are usually shown as strikethroughs.  In the above screenshot you can see that the third entry in the results has a "Missing: caversham" at the end of the snippet.

If Google is dropping a term that is essential to your search then prefix it with intext:, for example intext:caversham. If you want all of your terms to be included, and without any variations, then use the Verbatim search option.  If you are using a desktop or laptop run your search and then click on the Search tools option at the top of your results. A second line of options will appear. Click on All results and select Verbatim.  The layout and location of Verbatim on mobile devices will usually be different.

Double quotation marks around phrases

Double quotation marks around phrases, titles of papers, song titles, famous sayings etc. work most of the time. But, again, if Google finds zero or only a handful of results it will ignore the marks. Google may also alter the spelling of one or more words within the double quotation marks. Use Verbatim if you are sure that the phrase is correct and you want to bring Google to heel.

Full nested Boolean search

Google has NEVER supported full nested Boolean search. I still meet people who are adamant that Google does, but when pushed they admit that they often get unexpected results.  You can, though, use OR for alternative terms and the minus sign before a term to exclude documents containing that term.
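
If you have a stock of Boolean search strings written for other databases, a rough way of tidying them up for Google is sketched below in Python. It is an illustration under stated assumptions, not a description of how Google parses anything: the function name to_google_syntax is hypothetical, AND is simply dropped on the assumption that terms are combined by default, NOT is only handled when it is followed by a single term, and parentheses are left exactly as they are.

```python
def to_google_syntax(boolean_query):
    """Rewrite a Boolean-style string using only OR and the minus sign."""
    rewritten = []
    negate_next = False
    for token in boolean_query.split():
        if token == "AND":
            # Assumption: terms are combined by default, so AND is dropped.
            continue
        if token == "NOT":
            # Remember to put a minus sign on the term that follows.
            negate_next = True
            continue
        if negate_next:
            token = "-" + token
            negate_next = False
        rewritten.append(token)
    return " ".join(rewritten)


print(to_google_syntax("(confectionery OR chocolate) AND (production OR manufacture) NOT belgium"))
# (confectionery OR chocolate) (production OR manufacture) -belgium
```

Only upper case AND and NOT are treated as operators here, so an ordinary "and" or "not" inside a phrase is left alone.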

This is how Google interprets the search (confectionery OR chocolate) AND (production OR manufacture) AND (france OR Germany OR UK OR switzerland) NOT belgium



Note that pages containing Belgium are included rather than excluded.

Remove the 'NOT Belgium' and this is what we see:



Add '-belgium' to the end of the search instead of 'NOT belgium' and we get:



Running Verbatim on our original Boolean search shows that Google is treating AND and NOT as lower case, searchable words:



If you want to learn more about Google search, Dan Russell, who works at Google, is currently running an online course on Power Searching with Google.  Alternatively, if you want a more business or academic research and UK/European oriented workshop on what Google can do, I am running an advanced Google workshop with UKeiG on April 13th, 2016.