Karen Blakeman's Blog: 2017

Sunday 29 October 2017

Google makes it harder to change location for country specific research

Google has made a major change to search and it does not bode well. Results are now based on your current location. So what's new? Google has always looked at your location, even down to city/town level, and changed the results accordingly. That is fine if you are travelling and want to find the nearest Thai restaurant via your mobile, for example. Presenting a list of eateries in my home town of Reading is no good to me if I'm away in Manchester and getting very hungry!

The problems start if you are researching a person, company or industry based in a country other than your own - let's use Norway as an example - or just want the latest news from that country. The trick used to be to go to the relevant country version of Google, in this case www.google.no, run your search and Google would give preference to Norwegian content. It is a great way to get alternative viewpoints on a topic and more relevant "local" information on a subject. Now, regardless of which version of Google you go to, you will see the same results tailored for your home location.

In a blog posting, "Making search results more local and relevant", Google says:

Today, we’ve updated the way we label country services on the mobile web, the Google app for iOS, and desktop Search and Maps. Now the choice of country service will no longer be indicated by domain. Instead, by default, you’ll be served the country service that corresponds to your location. So if you live in Australia, you’ll automatically receive the country service for Australia, but when you travel to New Zealand, your results will switch automatically to the country service for New Zealand. Upon return to Australia, you will seamlessly revert back to the Australian country service.

This confirms that mobile search is what Google is concentrating on. After all it is, one assumes, where Google makes most of its money but it does not help professional researchers.

There is a way around it but it is rather long-winded. You need to go to Settings - use either the link in the bottom right hand corner of your Google home page or the one near the top of a search results page - and click on Advanced Search.

On the Advanced Search screen scroll down to “Then narrow your results by…” and use the pull down menu in the region box to select the country.

I ran a search on Brexit in google.co.uk, google.no and a few other country versions of Google. All gave me essentially the same results.

Using the region filter and selecting Norway as the country I am given the following by Google:

Notice, though, that Google is giving me English articles or English versions of them. Google has decided that I would prefer English articles and I have to scroll down to number 10 and beyond to see pages in Norwegian. To get a broader view of what is being said in Norway about Brexit I have to go back into settings, click on Languages and choose Norwegian/Norsk.

Brexit search with region and language filter on
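If you run this kind of search regularly, clicking through Settings each time soon becomes tedious. The sketch below builds a search URL that applies the region and language filters directly. It assumes that the Advanced Search form passes the region as a `cr` parameter (for example `countryNO`) and the language as an `lr` parameter (for example `lang_no`); that is what appears in the address bar after using the form, but Google does not guarantee these parameters and could change or drop them. The function name `google_search_url` is my own.

```python
from urllib.parse import urlencode


def google_search_url(query, country=None, language=None):
    """Build a Google search URL with optional region and language filters.

    Assumption: 'cr' (region) and 'lr' (language) are the parameters the
    Advanced Search form adds to the URL. They are not guaranteed by Google.
    """
    params = {"q": query}
    if country:
        # Region filter, e.g. "NO" becomes "countryNO"
        params["cr"] = "country" + country.upper()
    if language:
        # Language filter, e.g. "no" becomes "lang_no"
        params["lr"] = "lang_" + language.lower()
    return "https://www.google.com/search?" + urlencode(params)


# Brexit search restricted to Norway and to pages in Norwegian
url = google_search_url("Brexit", country="NO", language="no")
print(url)
```

Leaving out `language` gives the region-only search described above, which is the one that favoured English articles.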

Oh - and you get slightly different results if you go through a VPN and set Norway as the country.

What worries me even more is that Google could do away with the advanced search screen and the region filter with it.

Google says:

We’re confident this change will improve your Search experience, automatically providing you with the most useful information based on your search query and other context, including location.

No, Google. You have just made things more difficult for those of us who conduct serious, in-depth research. The way I feel about this change at the moment is that if you were a person I would take a baseball bat to your head!

UPDATE: In response to David Pearson's comment and reminder below.
Including a site command e.g. site:no in the search works relatively well for this particular example (Norway) and gives good but slightly different results. It will, of course, miss Norwegian sites that are registered as .com or other international domains. The amount of overlap (or lack of it) will vary depending on the country. It's another one to add to the list of strategies, which I am sure will become longer, for dealing with this problem.
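For anyone who wants to script the site command approach, here is a minimal sketch that adds a `site:` restriction to the search terms and encodes the result as a search URL. The helper name `site_restricted_query` is hypothetical; the only Google feature it relies on is the `site:` command itself, so the same caveat applies: Norwegian sites on .com or other international domains will be missed.

```python
from urllib.parse import urlencode


def site_restricted_query(terms, tld):
    """Append a site: restriction for a country domain such as 'no'."""
    # Accept both ".no" and "no"
    return f"{terms} site:{tld.lstrip('.')}"


query = site_restricted_query("Brexit", ".no")
url = "https://www.google.com/search?" + urlencode({"q": query})
print(query)
print(url)
```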

Wednesday 16 August 2017

RIP OFFSTATS

I'm back at work from an extended break only to find that my favourite statistics portal OFFSTATS is no more :-( https://www.library.auckland.ac.nz/about-us/collections/decommissioned-databases

I received an email from them explaining that they no longer have the resourcing available to maintain and develop the database. Also, as much of the content can now be discovered through other approaches they felt the need for this type of search tool was not so relevant as it had been a few years ago.

A shame but understandable from their point of view. It was always a popular resource on my search workshops and often featured in the participants' "Top Ten Tips". It was one of the few resources of this type in which humans assessed and monitored the quality and relevance of the sites listed. Very sorry to see it go.

Saturday 10 June 2017

Another example of Google's Knowledge Graph getting it wrong

Voting in the UK election has finished and the results are in, but the dust has most definitely not settled. It looks as if we in the UK are in for interesting times ahead. It would help those of us researching the various political parties and policies if Google could at least get the basics right, such as who is now the Member of Parliament for a particular constituency. I am in Reading East and we have switched from a Conservative MP to Labour (Matt Rodda). Out of curiosity, I tried a search in Google on Reading East constituency. This is what Google's Knowledge Graph came up with:

I took this screenshot yesterday (Friday, 9th June) at around 8 a.m. and expected to see Rob Wilson given as the MP throughout. I was impressed, though, to see that the snippet from Wikipedia correctly gives Matt Rodda as our MP. Whoever had updated that entry was pretty quick off the mark. Possibly a Labour Party worker? The rest of the information, which is taken from Google's database of "facts", is either wrong, confusing or nonsensical.

"Member of Parliament: Rob Wilson" - wrong. But he was MP until around 4 a.m. on the 9th June when the result of the election in Reading East was announced, so perhaps I am expecting a little too much from Google to be that quick about updating its facts.

"Major settlement: Reading" - yes we are part of Reading but I find it strange that it is referred to as a major settlement rather than a town.

"Number of members: 1" - not sure why that is there as each constituency can only have one MP.

"Party: Conservative" - correct for Rob Wilson but the new MP is Labour.

"European Parliament constituency: South East England" - correct!

The final two lines "Replaced by:" and "Created from:" had me totally flummoxed. The entries are the same - Reading North, Reading South, Henley. Reading North and Reading South were constituencies formed by splitting the Reading constituency in 1950. They were then merged back into Reading in 1955, re-created in 1974, and in 1983 Reading East and West were formed (Yes, it's complicated!). As for Henley, it is not even in the same county. I can only think that this comes from Caversham (now part of Reading East) being part of Oxfordshire until 1911, when it probably did fall within the Henley constituency. The "Replaced by" is wrong because Reading East has not been replaced by anything. Google can't even blame a template that has to be filled in with information at all costs because different information appears in the Knowledge Graph depending on the constituency.

Here is the information for Aylesbury:

And the one for Guildford:

Going back to how up to date the information is, how quickly does Google update its "facts"? Rob Wilson was still our MP mid Friday afternoon. I submitted feedback using the link that Google provides at the bottom of each Knowledge Graph but this morning (10th June) nothing had changed. I'll update this posting when it does change.

I would hope that most people would look at the other links in the search results, in this case the latest news, but preferably a reliable authoritative source. The list of MPs on the UK Parliament website would be an obvious choice but might take a day to be updated after an election. Just don't rely on Google to get it right.

Tuesday 14 February 2017

More Google weird results

Ok, we know that Google often does strange things with our searches but much of the time it is not obvious that something odd has happened. There are usually some "good enough" answers scattered through the first 20-30 results so that we shrug off the rest as "well, that's Google for you". Occasionally, though, one comes across a search that seems to break Google. One such example was reported on Twitter this morning by Rand Fishkin (@randfish). The search was

this is the best * on the internet

At the top of the first results page Google reported that it had found over a billion results but when @randfish moved to the next page Google showed just "2 of 12 results"! Whatever happened to the other billion or so?

I tried the search myself on my laptop and straightaway got three results but on repeating it that was reduced to two.

I repeated the search having logged out of my Google account, cleared cookies, used Incognito and different browsers. Same results.

I tried a phrase search and the number of hits increased to 17.

Then I removed the quotation marks, got back to my original set of two and ran Verbatim on it. Over a billion hits but, bizarrely, Google claimed to have gone straight to page 2!

Note: you normally can't see the number of results after you have run Verbatim because it is obscured by a second menu line. You can toggle between that menu and the number of hits by clicking on the Tools button.

Then I tried a phrase search followed by Verbatim: two results but different from my first set.

I could have gone on trying various advanced search commands but it is very clear that Google is having problems with this particular search. And, no, I have no idea what is going on here.

If Google messes with your search to this extent or comes back with far fewer results than you would expect don't struggle with it; just go to another search engine. As an asterisk is used in this search to stand in for a missing word Yandex.com would be the best option. (See https://yandex.com/support/search/how-to-search/search-operators.html for a list of the main operators).

Friday 10 February 2017

New Creative Commons image search - back to the drawing board I'm afraid

Locating images that can be re-used, modified and incorporated into commercial or non-commercial projects is always a hot topic on my search workshops. As soon as we start looking at tools that identify Creative Commons and public domain images the delegates start scribbling. Yes, Google and Bing both have tools that allow you to specify a license when conducting an image search but you still have to double check that the search engine has assigned the correct license to the image. There may be several images on a webpage or blog posting, each having a different copyright status, and search engines can get it wrong. Flickr's search also has an option to filter images by license and there are sites that only have Creative Commons photos, for example Geograph. But the problem is that you may have to trawl through several sites before you find your ideal photo.

Creative Commons has just launched a new image search tool that in theory would save a lot of time and hassle. You can find some background information on the service, which is still in beta, at "Announcing the new CC Search, now in Beta". The search screen is at http://ccsearch.creativecommons.org/.

The Creative Commons collections currently included in the search come from the Rijksmuseum, Flickr, 500px, New York Public Library and the Metropolitan Museum of Art. You can search by license type, title, creator, tags and collection.

As well as search there are social features that allow you to add tags and favourites to objects, save searches, and there is a one-click attribution button that provides you with a pre-formatted text for easy attribution. There is also a list creation option. To make use of these functions you need to register, which at present can only be done via email.

I started with a very simple search: cat

Hover over the image and you have options to Save to a list and to favourite it. It will also show you the title of the image and who created it. Click on the image and you are shown further information including tags together with a link that takes you to the original source.

So far, so good although I did think it rather odd that the image should have tags for both norwegian forest cat and nebelung but assumed that perhaps the cat was a cross between the two.

I decided to narrow down the search to norwegian forest cat, and this is where things started to go very wrong. There were a handful of cats but the rest seemed irrelevant. I put the terms inside quotation marks "norwegian forest cat". It made no difference.

I had a look at one of the non-cat images and the reason it had been picked up was that the creator called themselves Norwegian Forest Cat! So I unticked the options on the search screen for creator and title, leaving just the tags. At least the results were now cats but most did not look anything like norwegians.

CC Image search Norwegian Forest Cat in tags

I looked at the tags for one of the short haired mogs.

It seems that this is a very special creature. It is both a domestic long haired cat and a domestic short haired cat, a norwegian forest cat and a manx, a european shorthair and an american short hair. The creator of this photo must have had a brainstorm when allocating the tags, or perhaps Flickr's automatic tagging system had kicked in? It does sometimes come up with truly bizarre tags. I clicked through to Flickr to view the original.

The original tags were very different. The two sets had only cat, pet, and animal in common. I have no idea where the tags on the CC photo page had come from and could not find any information on how they had been assigned. This was repeated with all of the dozen images that I looked at in detail.

I decided to give up on cats and try one of my other test searches: Reading Repair Cafe. I know that there are about 75 images on Flickr that have been placed in the public domain. I know that because I took them. To make it easier on CC Search I chose to search titles and tags, and just the Flickr Collection. The results were total rubbish.

Looking at the details of the photos it became clear that CC Search is carrying out an OR search. Phrase searching did not work and using AND just created a larger collection of irrelevant images. (I confess I gave up after trawling through the first 12 pages). After the cat experience I checked the tags on a few photos but no sign of Reading Repair Cafe anywhere.
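To show why an OR search produces so much noise for a three-word query, here is a small illustration. It is not how CC Search is implemented, which I have no information on; the titles are made up for the example and the function names `matches_any` and `matches_all` are my own. It simply contrasts matching any term (OR) with matching every term (AND).

```python
def matches_any(terms, text):
    """OR search: true if at least one term appears as a word in the text."""
    words = set(text.lower().split())
    return any(term.lower() in words for term in terms)


def matches_all(terms, text):
    """AND search: true only if every term appears as a word in the text."""
    words = set(text.lower().split())
    return all(term.lower() in words for term in terms)


# Hypothetical image titles, invented purely for illustration
titles = [
    "Reading Repair Cafe June",
    "Reading a book in the park",
    "Bicycle repair stand",
    "Cafe in Oslo",
]
terms = ["reading", "repair", "cafe"]

or_hits = [t for t in titles if matches_any(terms, t)]
and_hits = [t for t in titles if matches_all(terms, t)]

print(len(or_hits), "titles match with OR")
print(and_hits, "match with AND")
```

With OR every one of the four titles is returned because each contains at least one of the words; with AND only the first survives.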

A search on Flickr and using the license filter worked a treat:

Google did a pretty good job too but to get perfect results I had to do a phrase search. (Note: as this is a regular test search of mine, I signed out of my Google account and went "Incognito" to stop Google personalising the results.)

Bing also did an excellent job at finding the photos.

Admittedly, CC Image Search is a prototype and in beta so one would expect there to be a few glitches. However, glitches seem to be the norm. I ran several more tests and the main stumbling block is that it combines terms using OR. There is no other option or any commands one can use to change that. My second concern is where on earth do the tags on the CC Search photo pages come from? Most of them do not appear on the original source page and many are completely wrong. I'm afraid it is back to the drawing board for CC Search.

Wednesday 11 January 2017

Google link command gone - never much good anyway!

Search Engine Roundtable reports today that Google is advising against using the link operator in search. It seems that there have been complaints on Twitter and elsewhere that it is returning some odd results.

I have never been a fan of the command; it only ever returned a small sample of pages that link to a known page, so I don't mention it in my workshops unless asked about it by one of the participants. When I saw the advice from Google I gave it a final go on my own domain rba.co.uk and got nearly 300,000 hits. "Wow," I thought, "amazing!" Glancing through the first few results it became obvious that Google had ignored all the punctuation and was running a text search and looking for variations on rba including RBS (Royal Bank of Scotland).

No great loss, but a sign that other more useful operators and commands may be for the chop.