  • 00:05

    [Creating an Online Tool for Network and Text Analysis, Netlytic] [MUSIC PLAYING]

  • 00:15

    PHILIP MAI: The Social Media Lab was started in 2010 at Dalhousie University in Nova Scotia by Anatoliy. [Philip Mai, MA, JD, Director of Business and Communications, Social Media Lab] I came on board shortly after to help him with basically raising the research profile of the lab because I knew the kind of questions and the kind of work that they were doing is important and that more people

  • 00:37

    PHILIP MAI [continued]: needed to be aware of it. So that's how I got involved.

  • 00:40

    ANATOLIY GRUZD: You need the plus. So as a director of research in the Social Media Lab, I'm responsible for developing the research agenda, supervising students and essentially executing research projects that we currently have in the Lab. [Anatoliy Gruzd, PhD, Director of Research, Social Media Lab] So let's use Netlytic to study this topic. So one of the tools that we use in the Social Media Lab

  • 01:01

    ANATOLIY GRUZD [continued]: to study different online communities is called Netlytic. Netlytic was developed 10 years ago to study forum-based discussions. This is when social media, platforms like Twitter and Facebook, were not that popular. They were just about to launch themselves. And so most of the internet users would turn to forum discussions to discuss different topics,

  • 01:24

    ANATOLIY GRUZD [continued]: create support groups, and there was a real need from social science side to understand how those groups operate, what actually draws people in, and how do we recreate and sustain healthy online discussions. And so as part of my background dissertation work, I developed a simple tool to first understand forum discussions.

  • 01:45

    ANATOLIY GRUZD [continued]: And because I was really interested in network structures of our society and how network structures influence-- potentially influence our actions and-- form new connections, I really wanted to understand how forum discussions can be turned into network visualizations and then use those network visualizations, network graphs,

  • 02:06

    ANATOLIY GRUZD [continued]: to understand the group dynamics. So this is really origins of Netlytic, a tool that was designed to just study forum discussions. But what happened is that social media became really popular, and a lot of colleagues-- my colleagues in the area of online communities started emailing me and asking, well, can we use your tool? We saw you using it to analyze forum discussions.

  • 02:29

    ANATOLIY GRUZD [continued]: Can we use it to study something like Twitter, something like discussions on the blog post? And so over the years, because of this high demand from colleagues in mostly social science fields, I started expanding Netlytic as a tool to allow new data sets and new platforms to be able to analyze.

  • 02:51

    PHILIP MAI: When Anatoliy first released Netlytic, it was something that you had to download. And the problem with that was that people were operating with different operating systems, different hardware at home, so we were getting too many "how do I" questions with Netlytic. And we knew that we didn't have the manpower to handle all-- the "how do I" questions.

  • 03:12

    PHILIP MAI [continued]: So we knew that we had to modify that software and make it something that only exists in the cloud. So this way there's one version of it, and we test it on x number of browsers so this way we can tell people ahead of time if you want to use the software-- But that's something that we learned and we iterate from just talking to people. And then we also realized that we can't make the tool that complicated

  • 03:34

    PHILIP MAI [continued]: because it was originally designed for researchers. But there are regular people, marketers and others, who want to use the tool, students. So we realized, oh, we need a better FAQ. We need to add additional features into the tools to make it more useful, so this way it is foolproof.

  • 03:53

    DEENA ABUL-FOTTOUH: I'm a post doctoral researcher here. My background is in political sociology theory. And I analyzed Twitter networks of the Egyptian Revolution over different time periods. [Deena Abul-Fottouh, PhD, Post Doctoral Researcher, Social Media Lab] And that's what introduced me to social network analysis because this type of research you cannot study it with the regular methods of sociology.

  • 04:17

    DEENA ABUL-FOTTOUH [continued]: My experience with the Lab was through their conferences, so I used to attend their conference. They have a conference every other year here in Toronto. So I used to attend these conferences. I found that the subjects that they do and the different studies that they make are pretty much close to my research interests. And because they do have Netlytic,

  • 04:39

    DEENA ABUL-FOTTOUH [continued]: which is like the software that analyzes social media-- social media data and I'm working with social media data, so that was pretty interesting for me.

  • 04:48

    ANATOLIY GRUZD: So we really spend a lot of time doing usability studies and understanding how social science researchers would go about their analysis. And so we quickly realized that we have to avoid as much as possible end-user requirements for them to install something or to know a computer programming language, any scripts. So the decision was to make it web based, so cloud based,

  • 05:11

    ANATOLIY GRUZD [continued]: where you can essentially access it from any computer as long as you have internet connection and make sure we reduce as many clicks as possible from the point where researchers say, OK, I want to study this topic or I want to study this group and to the point where they actually get the result of their analysis. First time you look into Netlytic,

  • 05:32

    ANATOLIY GRUZD [continued]: essentially you'll see a screen of all of the different data sets that you've already captured in the data available for analysis. As you can see, I have quite a few data sets that we're using for work here in the Lab. If you want to capture new data sets, you go into a new data sets page, and so now you have to figure out what

  • 05:54

    ANATOLIY GRUZD [continued]: platforms you want to study. Some researchers start with idea, OK, I want to study Instagram and that's where their data will come from. And some researchers are interested in studying a topic. So if you're studying a topic of discussion-- particular issue may happen across multiple platforms. So you might want to set up one data set for Twitter, one for Facebook pages, YouTube, and so on.

  • 06:17

    ANATOLIY GRUZD [continued]: It's very actually similar to other analytics type of platforms where first you have to think about what's the input. What type of data is on the input? And in our case, because we are interested in studying online communities that are based on social media platforms, so the input for us would be conversational data. And it's-- for us, it's only publicly available conversational

  • 06:38

    ANATOLIY GRUZD [continued]: data that we're studying. And so then once you get input into the system, then the second question is how do you clean it. So there are different processes that are responsible for formatting the data in a proper way, preparing essentially for analysis, removing duplicates, and so on.
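The cleaning step described here (formatting the data and removing duplicates) can be illustrated with a minimal Python sketch. It is not Netlytic's actual code: the `author` and `text` field names, the `normalize` and `remove_duplicates` helpers, and the rule that two messages are duplicates when the same author posts the same normalized text are all assumptions made for illustration.

```python
import re


def normalize(text):
    """Collapse runs of whitespace and lowercase the text so that
    trivially different copies of a message compare as equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def remove_duplicates(messages):
    """Keep the first copy of each (author, normalized text) pair.

    Assumption: a duplicate is the same author posting the same text;
    a real pipeline might use message IDs instead.
    """
    seen = set()
    unique = []
    for message in messages:
        key = (message["author"].lower(), normalize(message["text"]))
        if key not in seen:
            seen.add(key)
            unique.append(message)
    return unique


# Hypothetical sample data, invented for this sketch.
messages = [
    {"author": "alice", "text": "GDPR is here  #GDPR"},
    {"author": "Alice", "text": "gdpr is here #gdpr"},
    {"author": "bob", "text": "GDPR is here #GDPR"},
]
cleaned = remove_duplicates(messages)
print(len(cleaned), "unique messages")
```

The second message is dropped because it matches the first after normalization, while the third is kept because it comes from a different author.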

  • 06:58

    ANATOLIY GRUZD [continued]: And so once the data is in-- so input stage-- and pre-processed, then the system performs different forms of analysis. And so for our area of research, two most popular forms of analysis emerged-- text analysis and network analysis. Let's go to the next analysis, and it's called manual categories.

  • 07:20

    ANATOLIY GRUZD [continued]: So it's part of text analysis, but essentially it accounts for dictionaries of words that you want to analyze. And in our case, let's look whether people are saying something positive or negative about the university. And so we just simply hit visualize, and it brings up this word tree map visualization.

  • 07:44

    ANATOLIY GRUZD [continued]: And so usually you will have lots of different pieces because for this analysis, we only use two categories. And one was feeling good, and the other is feeling bad. And as you can see, the feeling bad category is very small, so that's why it's almost invisible. So most of the keywords appearing for this data set are coming from the positive sentiments,

  • 08:06

    ANATOLIY GRUZD [continued]: and these have messages with the words great, excited, proud, good, healthy, happy. And if you want to see the actual messages, what people are happy about, you just click on the visualization and so you can read the actual message. And so in this case it's sweet. People say happy to have been part of this awesome panel.
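A dictionary-based category count of the kind described in the manual categories demonstration could look roughly like the Python sketch below. It is a simplified illustration, not Netlytic's implementation: the "feeling good" words are the ones named in the demonstration, while the "feeling bad" words, the sample messages, and the `count_categories` helper are hypothetical.

```python
import re
from collections import Counter

# "feeling good" uses the keywords named in the demonstration;
# the "feeling bad" entries are invented placeholders.
CATEGORIES = {
    "feeling good": {"great", "excited", "proud", "good", "healthy", "happy"},
    "feeling bad": {"sad", "angry", "worried"},
}


def count_categories(messages, categories):
    """Count how many messages contain at least one word from each category."""
    counts = Counter()
    for text in messages:
        words = set(re.findall(r"[a-z']+", text.lower()))
        for name, vocabulary in categories.items():
            if words & vocabulary:
                counts[name] += 1
    return counts


# Hypothetical messages, apart from the first, which is quoted in the video.
messages = [
    "Happy to have been part of this awesome panel",
    "So proud of our great students",
    "Worried about the tuition increase",
]
counts = count_categories(messages, CATEGORIES)
print(dict(counts))
```

Counting messages rather than individual word hits is a design choice made here for simplicity; a tree map sized by keyword frequency would count each matched word separately.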

  • 08:28

    ANATOLIY GRUZD [continued]: Text analysis really helps you to understand the topics emerged in a group discussion and maybe trends-- study trends over time. And network analysis complements that. It shows you who engaged in the conversation, how frequently they stayed in the conversations, and how much influence do they have on other community members. Once you finish text analysis, then you

  • 08:50

    ANATOLIY GRUZD [continued]: can move to network analysis. And the difference is that it allows you to figure out who are the actors, stakeholders, involved in conversations and how much influence do they have on this conversation. The most common networks that we can analyze coming from Twitter are called name networks or who mentions whom networks.

  • 09:14

    ANATOLIY GRUZD [continued]: Essentially, it shows who's re-tweeting, mentioning, or replying to whose messages in this aggregated form. So with a small sample data set of 1,000 Twitter messages, we see there are 650 unique Twitter users who were engaged in this conversation, and they all collaboratively created

  • 09:35

    ANATOLIY GRUZD [continued]: nearly 3,000 connections.
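To make the idea of a "who mentions whom" network concrete, here is a minimal Python sketch that turns messages into weighted directed edges. It only approximates the idea under stated assumptions: the `author` and `text` fields, the `build_mention_network` helper, and the use of `@username` patterns to detect retweets, replies, and mentions are illustrative choices, not a description of how Netlytic extracts its name networks.

```python
import re
from collections import Counter

# Assumption: any @username in the text counts as a tie from the author
# to that user, whether the tweet is a retweet, a reply, or a mention.
MENTION = re.compile(r"@(\w+)")


def build_mention_network(tweets):
    """Return a Counter mapping (source, target) pairs to the number of
    times the source account mentioned the target account."""
    edges = Counter()
    for tweet in tweets:
        source = tweet["author"].lower()
        for target in MENTION.findall(tweet["text"]):
            target = target.lower()
            if target != source:  # ignore self-mentions
                edges[(source, target)] += 1
    return edges


# Hypothetical tweets, invented for this sketch.
tweets = [
    {"author": "alice", "text": "RT @bob: GDPR takes effect today"},
    {"author": "carol", "text": "@bob @alice what changes for small firms?"},
    {"author": "alice", "text": "@bob thanks for the summary"},
]
edges = build_mention_network(tweets)
users = {user for edge in edges for user in edge}
print(len(users), "users,", sum(edges.values()), "connections")
```

Each distinct pair becomes one edge whose weight is the number of mentions, which is one way to read the "aggregated form" mentioned above.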

  • 09:37

    DEENA ABUL-FOTTOUH: The biggest thing about Netlytic is the-- I haven't seen any other software that can grab you social media data from different platforms. The visualization is pretty good. You can look at different networks. You cannot do that without a tool or a software that is

  • 09:57

    DEENA ABUL-FOTTOUH [continued]: designed to visualize large networks because from the visualization, you can detect the different communities you have in the network. And also it has some of the basic statistics or metrics for networks that you can start with and lead you more into further analysis of the data.
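Two common examples of basic network metrics are in-degree (how often an account is mentioned by others) and density (the share of possible ties that actually exist). The Python sketch below computes both for a small directed edge list. Which metrics Netlytic itself reports is not stated in the discussion, so the choice of these two, the helper names, and the sample edges are assumptions.

```python
from collections import Counter


def in_degree(edges):
    """Count incoming ties per account in a directed edge list."""
    counts = Counter()
    for _source, target in edges:
        counts[target] += 1
    return counts


def density(edges, node_count):
    """Directed density: distinct ties divided by the n * (n - 1)
    ties that are possible among n accounts."""
    if node_count < 2:
        return 0.0
    return len(set(edges)) / (node_count * (node_count - 1))


# Hypothetical who-mentions-whom edges, invented for this sketch.
edges = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("carol", "alice"),
    ("dave", "bob"),
]
nodes = {user for edge in edges for user in edge}
degrees = in_degree(edges)
network_density = density(edges, len(nodes))
print(degrees.most_common(1), round(network_density, 3))
```

An account with a high in-degree, such as `bob` in this toy data, is a natural starting point when looking for influential participants.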

  • 10:19

    ANATOLIY GRUZD: Let's try to analyze what people are talking about when they talk about General Data Protection Regulation. And so as you know, the regulation was implemented by European Union and was implemented on the 25th of May. And so it's been a couple months since people and organizations

  • 10:41

    ANATOLIY GRUZD [continued]: tried to adapt to these regulations, and what we wanted to see today is what sentiments and topics emerged around these regulations. Do people like it? Do they think it's effective? So let's use Netlytic to study this topic.

  • 10:57

    DEENA ABUL-FOTTOUH: Are we doing only sentiment analysis, or are we also going to do network analysis in terms of who replies to whom, who talks to whom?

  • 11:04

    ANATOLIY GRUZD: That's a good question. Let's start with network analysis because it will highlight some of the key players in this conversation. Who is trying to influence the regulation implementation? Yeah, let's do network analysis. But let's also discuss what social media platforms should be analyzed. Where do you think most of the relevant conversations

  • 11:25

    ANATOLIY GRUZD [continued]: would happen about this regulation? So we have Twitter, Facebook, Instagram, or YouTube as an option.

  • 11:32

    DEENA ABUL-FOTTOUH: Well, usually these types of conversations, since they go into both political and legal conversations, we find them more on Twitter.

  • 11:39

    ANATOLIY GRUZD: How about Instagram or YouTube? Do you think those are viable options for us?

  • 11:44

    DEENA ABUL-FOTTOUH: Anatoliy, I think we're looking into pictures or videos since we're talking more about the type of replies or mentions I think, Twitter and-- for the instance that which I can give you in terms of a user would be a more suitable one. We can go later to Instagram and YouTube for sure if we are interested in other types of media. Yeah, yeah, for personal stuff.

  • 12:06

    DEENA ABUL-FOTTOUH [continued]: Yeah, because you don't find much of a person-- political and legal discussions on these other platforms.

  • 12:13

    ANATOLIY GRUZD: I agree, so let's go with Twitter. And perhaps we have to talk a little bit about what keywords we should use to capture relevant messages. General Data Protection Regulation, it seems like GDPR is very commonly used. But let's check first-- before we even collect the data, let's go to Twitter directly and search for GDPR.

  • 12:38

    ANATOLIY GRUZD [continued]: And usually, people would use hashtag, would you say, as a sign that they are talking about this issue.

  • 12:45

    DEENA ABUL-FOTTOUH: Yeah, as opposed to inside the content of the tweet.

  • 12:48

    ANATOLIY GRUZD: So what we're seeing here is essentially most recent Twitter messages that include hashtag GDPR. And what we want to make sure is that all of the messages are relevant to what we want to study. Let's look at some of the ones that come out here. So there is a--

  • 13:07

    DEENA ABUL-FOTTOUH: So we're looking at the common hashtags now if they come with GDPR, right?

  • 13:13

    ANATOLIY GRUZD: There is a data security hashtag, data protection. So I think just to be on the safe side, let's collect Twitter messages that mention hashtag GDPR only for now. And later on, we can I think explore subtopics. So let's create the data set in Netlytic. So we'll capture any tweets that mention the hashtag GDPR,

  • 13:36

    ANATOLIY GRUZD [continued]: and let's collect it for a period of time. Let's say a week. And we'll start inputting the data. So the data set we have right now, 19,000 recent tweets on this regulation. And you can see a preview of tweets we picked up that are relevant to our criteria.
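The distinction drawn in the discussion between a hashtag and a word that merely appears inside the tweet can be sketched as a small Python filter. This is an illustration only: the `has_hashtag` helper and the sample tweets are hypothetical, and the sketch filters text that has already been collected rather than reproducing how Netlytic queries Twitter.

```python
import re


def has_hashtag(text, tag):
    """Return True if the text contains #tag as a whole hashtag,
    ignoring case. Plain mentions of the word without '#', and longer
    hashtags that merely start with the tag, do not match."""
    pattern = re.compile(r"(?<!\w)#" + re.escape(tag) + r"(?!\w)", re.IGNORECASE)
    return pattern.search(text) is not None


# Hypothetical tweets, invented for this sketch.
tweets = [
    "New #GDPR guidance for small businesses",
    "Is your company ready? #gdpr #dataprotection",
    "GDPR explained in five minutes",
    "Celebrating #GDPRday at the office",
]
matching = [tweet for tweet in tweets if has_hashtag(tweet, "GDPR")]
print(len(matching), "of", len(tweets), "tweets use the hashtag")
```

Matching on the hashtag alone is the conservative choice made in the conversation; widening the criteria to related hashtags such as data protection would be a later step.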

  • 13:56

    ANATOLIY GRUZD [continued]: But instead of reading 19,000 tweets one at a time, because it would be too time consuming, we wanted to capture general groups of accounts that have engaged in this conversation. And I think network analysis is a great method to do so. So you have all these 19,000 tweets

  • 14:16

    ANATOLIY GRUZD [continued]: visualized as a graph, and so each dot represents a Twitter account that posted or retweeted at least one message with the hashtag we are interested in. And you can see there are some outliers that only participated in the conversations once and really not engaged in key conversations. With some of the technologies that went into designing a tool

  • 14:38

    ANATOLIY GRUZD [continued]: Netlytic has changed over time. So as the web changes and some of the technologies becoming more prominent or less prominent, we also adapted it. So at the core, we're using more popular programming languages for the web, like PHP, Python, but on the visualization end, we have--

  • 14:58

    ANATOLIY GRUZD [continued]: look at what's out there, what's popular, and what browsers can support. So a lot of it is based on JavaScript visualizations so essentially allows any browser to visualize networks. You certainly need to think about what tools and techniques to use for the analysis. And you can-- what we realized in the development stage

  • 15:20

    ANATOLIY GRUZD [continued]: that there are a lot of various libraries that can support machine learning, text analysis, network analysis, but it also may become very overwhelming for the end user. And so over the years, instead of expanding the number of types of analysis you can do with the tool, we actually focus on what actually researchers

  • 15:42

    ANATOLIY GRUZD [continued]: are most interested in and decided to improve those features that are most in demand, most relevant to our community members. [MUSIC PLAYING] What big data sets essentially allowed us to do as data scientists essentially is to study society at scale. It used to be that you can only study a very small sample

  • 16:05

    ANATOLIY GRUZD [continued]: of your population that you're interested in studying through more traditional ways of collecting data like surveys and interviews. So with the emergence of social media data and especially publicly available social media data, it allows us to study the issues that society is concerned about at the large scale.

  • 16:27

    ANATOLIY GRUZD [continued]: And it also allows us to see how society and the opinions change over time and who are the key players in this type of movements.

Abstract

Social Media Lab's Philip Mai, MA, JD, Director of Business and Communications, Anatoliy Gruzd, PhD, Director of Research, and Deena Abul-Fottouh, PhD, post-doctoral researcher, discuss the online social network and text analysis software program, Netlytic, including its origins; benefits to social science researchers; how text data is collected, cleaned, and analyzed; combining text analysis with network analysis; a demonstration of how Netlytic works; and how Netlytic keeps up with evolving technology.
