Archive - Analytics

From Information to Knowledge. Google's Knowledge Graph

Google has announced it is enhancing its search engine – moving from an “information engine to a knowledge engine”. It looks pretty cool. While not yet available everywhere (those of us in Australia will have to wait a while), it is being rolled out in the US. As The Conversation explains very well:

The aim is to provide a more intelligent search engine – one that isn’t based on simply matching strings (a sequence of characters, such as a word) to single web pages. Instead, the Knowledge Graph will “understand” what you’re searching for and provide more relevant and precise information.

…About a year ago, Google acquired Freebase, an “open shared database of the world’s knowledge”. It’s a repository of structured knowledge describing over 20 million entities, each with a unique identifier, a type (people, place, book, film, building) and a set of properties (e.g. date of birth for a person, latitude and longitude for a place).

Each entity is represented by a topic node in the massive graph that underpins the database. Properties can be used to specify relationships between entities and topics.

Freebase’s approach builds on “semantic web” technologies and the more recent Linked Open Data movement. The Semantic web has been driven to a large degree by Tim Berners-Lee, the inventor of the World Wide Web. While the concept of the Semantic Web has been around for more than ten years, the vision described in a 2001 Scientific American article by Berners-Lee has largely been unrealised.
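To make the quoted description a little more concrete, here is a minimal Python sketch of entities as typed nodes whose properties can point at other entities. It is only an illustration: the `Entity` class, the identifiers and the `related` helper are hypothetical names of my own, not Freebase's or Google's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One topic node: a unique identifier, a type and a set of properties."""
    id: str
    type: str
    properties: dict = field(default_factory=dict)


# Hypothetical identifiers and values, for illustration only.
graph = {
    "person/tim_berners_lee": Entity(
        id="person/tim_berners_lee",
        type="person",
        properties={
            "name": "Tim Berners-Lee",
            # A property whose value is another entity's identifier
            # acts as a relationship (an edge in the graph).
            "invented": "thing/world_wide_web",
        },
    ),
    "thing/world_wide_web": Entity(
        id="thing/world_wide_web",
        type="invention",
        properties={"name": "World Wide Web"},
    ),
}


def related(graph, entity_id, prop):
    """Follow a property to the entity it points at, or return None."""
    target = graph[entity_id].properties.get(prop)
    return graph.get(target)


invention = related(graph, "person/tim_berners_lee", "invented")
print(invention.properties["name"])  # World Wide Web
```

The point of the sketch is the difference from string matching: a search for the inventor can follow the `invented` link to a distinct entity, rather than simply finding pages that contain the same words.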

Google has tried many different initiatives with varying degrees of success, such as Google Wave, Google Health and Google+, but Google’s forte is search, and this initiative will, I hope, mean that we not only get more information from our searches but also learn more.

 

 


New research plus Twitter. Does it make a difference in the clinic?

I first published this article on BodyInMind – a pain research site that I am also involved in. The tricky question of how we measure, properly measure, the impact in the clinic of disseminating research using social media has come up time and time again in our meetings, and as yet we have found no answer.

A recent article in the Journal of Medical Internet Research (JMIR)[1] looked at whether it is feasible to measure social impact of, and public attention to, newly published research articles by analysing buzz in social media – specifically Twitter. It also asked whether these metrics are sensitive and specific enough to predict highly cited articles – something that would be valuable for researchers to know.

It might seem a strange thing to use. Twitter is a vehicle for people to communicate to their chosen network, limited to 140 characters per ‘tweet’. How can this chatter be used to predict whether a journal article will be highly cited in the future?

How is research evaluated at the moment? Well, in at least two ways. Citations measure the productivity and impact of a researcher, and the impact factor evaluates the impact of a journal. However, citations only measure uptake within the scientific community and take a long time to gather. The impact of research in the real world and uptake by the public is very hard to measure, and currently there is no really accurate way of doing it, something which this research hoped to address.

So how was this research done? Over a period of 3.5 years, tweets with links to JMIR were gathered, and from these, 1600 tweets (or ‘tweetations’[2]) talking about 55 articles in a 2 year period were analyzed. Social media impact was compared against data from Scopus and Google Scholar 17-29 months later (which is how long it traditionally takes to gather citations).

Using this data, a new algorithm was devised and tested to see if it was possible to gauge accurately, within one week of publication in JMIR, whether an article would go on to be highly cited (bearing in mind that this can take up to 2 years to find out).

The author found that if an article is highly tweeted then there is a 75% likelihood that it will end up in the top quartile of all articles of an issue, ranked by citations. Most tweets were sent on the day of publication (44% of all tweets in a 2 month period), with 18% on the following day, followed by a rapid decay. In other words, tweets can predict highly cited articles within the first 3 days of article publication. Low impact articles are tweeted and retweeted mainly on day 0 and 1. Highly cited articles continue to be retweeted widely.
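As a rough illustration of the comparison described above (and not the author's actual algorithm), the Python sketch below ranks a handful of articles by early tweets and by later citations, then checks how many of the highly tweeted ones also land in the top citation quartile. All numbers, article labels and function names are hypothetical.

```python
def top_quartile(scores):
    """Return the set of keys whose score ranks in the top 25%."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = max(1, len(ranked) // 4)
    return set(ranked[:cutoff])


# Hypothetical figures for eight articles in one issue (not from the paper).
tweets_first_3_days = {"a": 40, "b": 3, "c": 25, "d": 1,
                       "e": 7, "f": 2, "g": 5, "h": 0}
citations_later = {"a": 30, "b": 4, "c": 6, "d": 2,
                   "e": 22, "f": 1, "g": 3, "h": 5}

highly_tweeted = top_quartile(tweets_first_3_days)
highly_cited = top_quartile(citations_later)

# Share of highly tweeted articles that also became highly cited.
hits = highly_tweeted & highly_cited
share = len(hits) / len(highly_tweeted)
print(sorted(highly_tweeted), sorted(highly_cited), share)
```

With these made-up numbers one of the two highly tweeted articles is also highly cited, so the share is 0.5. The 75% figure reported in the article is the same kind of ratio, calculated on the real JMIR data.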

The so what factor

We discussed this article as part of our weekly BiM meeting – along with eating some Very Excellent Tiramisu that Luke made – and there are some questions as to bias in this article. The first is that the author, Gunther Eysenbach, is founder and editor-in-chief of the JMIR. This journal is open access (freely available) and covers research, information and communication in the healthcare field. As a topic this article is well suited to the journal, but it may have been better if it had been peer reviewed and published in another journal, PLoS ONE perhaps.

The second is that the author is coining new phrases (such as twimpact[3]) introduced as part of his research and has set up websites with that name in the hope, we presume, that the algorithm and metric become widely used. [CORRECTION: the Twimpact website is NOT associated with Professor Eysenbach or this research – see his comment on the original article at bodyinmind.org]

There are also some caveats with the research which the author himself points out. Although top cited articles can be predicted from tweeted articles, social impact measures can only complement traditional citation metrics but not replace them.

For example, tweetations are a metric for social impact and how quickly new knowledge is taken up by the public, whereas citations are a metric for scholarly impact. They measure uptake by, or interest of, different audiences. The twimpact factor (cumulative number of tweets after a certain number of days) complements the impact factor in that it is a useful metric to measure, in real time, uptake of research findings that resonate with the public.
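A minimal Python sketch of how a twimpact factor such as tw7 could be counted from tweet dates. It assumes the window runs from the day of publication (day 0) through day 7 inclusive. The function name and the dates are hypothetical and not taken from the paper.

```python
from datetime import date


def twimpact_factor(published, tweet_dates, days=7):
    """Cumulative number of tweetations from day 0 through day `days`.

    Assumption: day 0 is the publication date and the window is inclusive.
    """
    return sum(0 <= (d - published).days <= days for d in tweet_dates)


# Hypothetical publication date and tweet dates for a single article.
published = date(2011, 12, 1)
tweet_dates = [
    date(2011, 11, 30),  # before publication, ignored
    date(2011, 12, 1),   # day 0
    date(2011, 12, 1),   # day 0
    date(2011, 12, 2),   # day 1
    date(2011, 12, 8),   # day 7
    date(2011, 12, 9),   # day 8, outside the window
]

tw7 = twimpact_factor(published, tweet_dates)
tw0 = twimpact_factor(published, tweet_dates, days=0)
print(tw7, tw0)  # 4 2
```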

At the moment we also don’t know if the Twitter mentions are the result of someone influential tweeting and people getting on the popularity bandwagon or if they reflect the actual quality of the article. They only show us how the question or topic (and possibly conclusions, if the article has actually been read) resonates with Twitter. In other words we may be measuring the structure of the network and attributes of social media communities rather than the attributes of the information itself.

Popularity is a useful measure for commercial enterprises, but relying on it for research may sideline topics that do not resonate with the general public, and may further marginalise groups who are not represented on Twitter, e.g. low income old age groups.

This is still a very new field and the author (as the editor of JMIR) has issued a standing call for papers to ‘assess the robustness of these social media metrics and their ability to detect signals among the noise of social media chatter’.

He rightly points out that attentiveness to issues is a prerequisite for social change, and tweets are a useful metric to measure attentiveness to a specific scholarly publication. For us at BiM I wonder whether we can use new social media avenues to get the explain pain message out more effectively.  What we can’t yet do is measure what effect, if any, this has at the level of patient care.

Definitions and Reference

[1] Eysenbach, G. (2011). Can Tweets Predict Citations? Metrics of Social Impact Based on Twitter and Correlation with Traditional Metrics of Scientific Impact. Journal of Medical Internet Research, 13(4). DOI: 10.2196/jmir.2012

[2] Tweetation – a Twitter citation, e.g. tw7 for seven days (skewed by publication date)

[3] twimpact factor – TWIF7 = cumulative number of tweetations 7 days after publication


New analytics platform – deep diving into twitter

With over 55 billion tweets sitting in their database and more being added in real time, twitter provides a lot of data mining opportunities – both for brands and for research. If you are at all interested in analytics this new offering from PeopleBrowsr is worth having a look at:

  • Real time Twitter search – over 3 years’ worth of data segmented by location, community, interests, gender, sentiment…
  • Deeper diving with viral analytics platform
  • Engagement platform – allows searches (like tweetdeck) AND a separate column of charts based on the live search data being pulled
  • Generate raw data for export
  • Collect info in a personalised ‘playground’

 


8 Google Tools by Hubspot

If you want an overview of some of the things that Google offers, Hubspot have an excellent ebook. If you don’t know about Hubspot, they offer a ton of free resources and clear explanations on business use of social media, as well as software. Below is one such example.

Download (PDF, 1.89MB)


Jitterjam – measuring conversations

JitterJam combines ‘social media monitoring, an intelligent contact database and a multi-channel digital marketing platform into a single, integrated Social CRM system’.

It is the measuring aspect, rather than the marketing, that I am interested in. Maybe this is a way that I can start looking at whether dissemination of health research has any effect on clinical practice – or are we just having a chat online and not changing clinical practice at all?

 

 

 

 


Introducing Research.ly, Analytic.ly and PeopleBrowsr

People I have shown Research.ly to are often shocked. “What?! It has the data on every tweet?” Yes. Everything you have put out there. Over three years’ worth. And it can take that and slice and dice it any way you want: positive or negative sentiment, gender, location, top retweeters, who made the first tweet on a trending topic. It is one of the new platforms developed by the team at PeopleBrowsr.

When I showed it to one clinical researcher in the office the response was ‘I am never going on twitter, ever.’  ‘People presume they are having a private conversation, albeit in a social space – they don’t expect to be tracked, comments analyzed and remembered’.  Well, what goes online stays online and these are powerful platforms.

Research.ly

What does Research.ly do? It is a search engine and analytic tool for social media and it’s very good for those topics that get a lot of conversations happening. For example, a new clothing store, Zara, has just opened in Sydney, and instantly you can find out who is talking about it, where, and who the top proponents of the store are.


Viral Analytics Platform

Let’s take that a step further.  Research.ly translates that data as a Viral Analytics Platform. You can just as easily search for a peak in conversation over the last three years and find out what was being said using the datamine.  Take for example the case of the dating site eharmony.  They had a spike of online conversation over a year ago in January 2010.

Click on the spike and it takes Research.ly seconds to find the conversations about a discrimination lawsuit against eharmony (for excluding gays and lesbians). Using the datamine, eharmony can see exactly what was said, who said it and the sentiment. Top right of the screen you can search other topics, like this year’s South by Southwest.
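I don’t know how Research.ly finds its spikes, but the general idea of spotting a peak in conversation volume can be sketched in a few lines of Python. The version below simply flags any month whose volume sits well above the average for the period. The monthly counts, the threshold and the function name are all hypothetical.

```python
from statistics import mean, pstdev


def spikes(volumes, threshold=2.0):
    """Return labels whose volume is more than `threshold` standard
    deviations above the mean of the whole period."""
    values = list(volumes.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # flat series, nothing stands out
    return [label for label, v in volumes.items()
            if (v - mu) / sigma > threshold]


# Hypothetical monthly mention counts for a brand (not real data).
monthly_mentions = {
    "2009-10": 120,
    "2009-11": 130,
    "2009-12": 125,
    "2010-01": 900,
    "2010-02": 140,
    "2010-03": 135,
}

print(spikes(monthly_mentions))  # ['2010-01']
```

Once a spike has been flagged, the interesting part is the step described above: going back to the tweets from that period to see what was actually being said.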


The platform also slices and dices the conversations by community and builds a score card – so that you can see who the top 15 communities are that are positively mentioning a brand.  The communities are defined by looking at people’s twitter bios – in other words how people define themselves.

And then it takes the top three communities and finds those people within them who are the very top positive influencers for the brand. What does that mean? That a company can find its top advocates – champions. These people will create a good online conversation for them – and that is worth a lot of bucks to some companies.
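The scorecard idea can be sketched roughly as follows. This is my own simplified Python illustration, not PeopleBrowsr’s implementation: it assumes each mention already carries a sentiment label and the author’s bio, and it assigns communities by matching a few hypothetical keywords in that bio.

```python
from collections import Counter

# Hypothetical keyword -> community mapping, matched against Twitter bios.
COMMUNITIES = {"fashion": "fashion", "blogger": "bloggers", "student": "students"}


def communities_for(bio):
    """Assign communities from the words people use to describe themselves."""
    text = bio.lower()
    return {name for keyword, name in COMMUNITIES.items() if keyword in text}


def scorecard(mentions, top_n=3):
    """Rank communities by their number of positive brand mentions."""
    counts = Counter()
    for m in mentions:
        if m["sentiment"] == "positive":
            counts.update(communities_for(m["bio"]))
    return counts.most_common(top_n)


def top_advocates(mentions, community, top_n=2):
    """Rank users within one community by positive brand mentions."""
    counts = Counter(
        m["user"] for m in mentions
        if m["sentiment"] == "positive" and community in communities_for(m["bio"])
    )
    return counts.most_common(top_n)


# Hypothetical mentions of a brand (users and bios are made up).
mentions = [
    {"user": "ana", "bio": "Fashion blogger in Sydney", "sentiment": "positive"},
    {"user": "ana", "bio": "Fashion blogger in Sydney", "sentiment": "positive"},
    {"user": "ben", "bio": "Student and fashion fan", "sentiment": "positive"},
    {"user": "cat", "bio": "Student", "sentiment": "negative"},
    {"user": "dan", "bio": "Tech blogger", "sentiment": "neutral"},
]

print(scorecard(mentions))
# [('fashion', 3), ('bloggers', 2), ('students', 1)]
print(top_advocates(mentions, "fashion"))
# [('ana', 2), ('ben', 1)]
```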

Customise Viral Analytics

Let’s take that a step further. The Viral Analytics platform can be customised – in this case Coca-Cola gets real-time data on the conversations across Twitter, Facebook (pages), blogs… for any number of accounts and keywords, and they can also export the data. They can do the same for the competition, in this case Pepsi, Red Bull and Dr Pepper. The team have also just added (this is very new) a workspace tab, where you can look at who is tweeting within your company. This is a very powerful tool indeed.


So, PeopleBrowsr platforms – real-time data, a social search engine.  That remembers, tracks and analyses.

