Gephi Tutorial: How to use it for Network Analysis?

If you would like to get your hands dirty with organizational network analysis (ONA) software, we have prepared a simple Gephi tutorial to help you run a basic analysis on a sample dataset. Doing the analysis yourself gives you a better understanding of its logic and of the opportunities and limitations of this open-source software. It also leads to a more meaningful interpretation of the results, because you can use your knowledge of the context to understand what the network statistics mean for the organization.

What is Gephi?

Gephi is one of the most popular open-source software packages for network analysis. It is praised especially for its network visualizations, which range from beautiful network art to very technical network graphs.

What is Gephi good for?

Gephi is good for beginners in network analysis. This point-and-click software handles basic and, through plugins, advanced network analytics, as well as customizable graph visualizations. You can analyze network structures, communities, and key actors, and design network graphs for static and dynamic networks, geo-located data, and multimode/multiplex networks. However, Gephi is more limited analytically than the libraries available in programming languages such as R or Python. At the moment, R offers the best-developed toolkit for both descriptive and inferential network analyses.


Our FULL GEPHI TUTORIAL includes a complete list of summary action points for precise network analysis in Gephi!


Gephi Tutorial: Introduction to network analysis with the software

Nodes and Edges

Before discussing the software, let’s first discuss the data. In network analysis, we speak about nodes and edges. Nodes are the entities that are connected. They can be people, business units, organizations, skills, knowledge, countries, etc. In our case, nodes will be people. Edges are the connections between the nodes, which, in our case, are work-related interactions of different types (e.g., communications, management, informal ties, etc.).

Figure 1. One-mode, directed, multiplex network

Questions and data

If you collect data through a questionnaire, you might ask questions such as “Who do you communicate with the most?” or “Whom do you trust the most at work?”. These questions are relational and will provide you with data on how people are connected through strong communication ties or through relations of interpersonal trust.

With OrgMapper | INFLUENCE, we use a complex questionnaire with 18 questions on which employees of an organization nominate their peers, measuring five different dimensions of influential people: communication skills, leadership capabilities, the potential for mobilization, change readiness, and interpersonal trust. The nomination process is facilitated by an easy-to-use online platform, accessible in all digital formats, as well as offline.

Figure 2. Sample questions from OrgMapper Influence Survey

Data files

Data should be organized in two spreadsheets: (1) edgelist, (2) node attributes. These files work best if saved as .csv.

Figure 3. Save as .csv in Excel

The edgelist contains at least two columns: (a) Source and (b) Target. Any relevant edge attribute can be added as a further column (e.g., Date, Type). Source holds the respondents and Target the individuals they nominated. Of course, some people will be both a Source, when they nominate others, and a Target, when other respondents nominate them. One row in the spreadsheet represents one interaction between two people, A and B. We thus end up with a one-mode, directed, weighted network. One-mode means the data connect only one type of node: people. Directed means the direction of the nomination matters: A nominated B. Weighted means that nominations accumulate when one person nominates the same colleague on several interaction types (e.g., A nominated B on the questions about communication, management, and informal ties).

Figure 4. Sample edgelist format
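
As a minimal sketch of the format described above, the following Python snippet builds a small edgelist as CSV text. The people (A, B, C), the Type values, and the file name in the final comment are made up for illustration; only the Source and Target column headers come from this tutorial.

```python
import csv
import io

# Hypothetical nominations: (respondent, nominated colleague, question type).
nominations = [
    ("A", "B", "communication"),
    ("A", "B", "management"),
    ("A", "B", "informal ties"),
    ("B", "C", "communication"),
    ("C", "A", "informal ties"),
]

def edgelist_csv(rows):
    """Return CSV text with one row per nomination."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    # Source and Target are the required columns; Type is an optional edge attribute.
    writer.writerow(["Source", "Target", "Type"])
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = edgelist_csv(nominations)
print(csv_text)
# To save the text for import (hypothetical file name):
# open("edgelist.csv", "w", newline="", encoding="utf-8").write(csv_text)
```

Note that A appears three times as the Source of B: each nomination stays on its own row, and the rows are only combined into a weight later, at import.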

The node attributes file contains the node IDs (unique identifiers) and the characteristics of the people (e.g., hierarchy level, location, business unit, department, gender, years in service, etc.).

Figure 5. Sample node attributes format
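
The node attributes file can be sketched in the same way. The example below is an illustration under assumptions: the people, attribute names, and values are invented, and the helper `missing_nodes` is a hypothetical check, not a Gephi feature. It flags people who appear in the edgelist but have no row in the attributes file, which is worth catching before import.

```python
import csv
import io

# Hypothetical attribute rows; "Id" must match the names used in the edgelist.
node_attributes = [
    {"Id": "A", "Level": "manager", "Gender": "female"},
    {"Id": "B", "Level": "employee", "Gender": "male"},
    {"Id": "C", "Level": "employee", "Gender": "female"},
]

# Hypothetical (Source, Target) pairs taken from an edgelist.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]

def missing_nodes(edges, attributes):
    """Return IDs that appear in the edgelist but have no attribute row."""
    known = {row["Id"] for row in attributes}
    used = {node for edge in edges for node in edge}
    return sorted(used - known)

def attributes_csv(attributes):
    """Return CSV text with one row per person and one column per attribute."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(attributes[0].keys()))
    writer.writeheader()
    writer.writerows(attributes)
    return buffer.getvalue()

print(attributes_csv(node_attributes))
print("Missing from attributes file:", missing_nodes(edges, node_attributes))
```

In this made-up data, person D was nominated but has no attribute row, so the check reports D.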

We will use the edge weight to show, with the thickness of the link, the strength of the relationship between two people at work, and we will use the node attributes to color the nodes according to specific characteristics (e.g., different colors for gender, location or hierarchy level).

Figure 6. Example color-coding by hierarchy level




Install Gephi

To install Gephi, go to www.gephi.org. Check the system requirements and download the version for your operating system. At the time of writing, the latest version of Gephi is 0.9.3, which, unlike previous versions, does not require a separate Java installation.

Add data to Gephi

Once Gephi is installed and opened, you’ll see a first pop-up window:

Figure 7. Click New Project on Gephi Open

Click New Project to activate the Data Laboratory tab.

Figure 8. Data Laboratory - Import Spreadsheet

Click on Data Laboratory and then on Import Spreadsheet. If the Data Table window shown in the screenshot above is missing, go to the main menu: Window 🡪 Data Table. The Import Spreadsheet button should then be available.

Figure 9. Edgelist preview window
Figure 10. Preview window data import

In the figure above, pay attention to the following. Check whether Gephi recognizes your network as directed or undirected, and whether the number of nodes and edges seems correct. Choose the edge merge strategy: for example, if one person nominates the same colleague on several questions, the Sum strategy adds up the nominations A gave to B. Finally, choose Append to existing workspace to add the data to the workspace that is already open.
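
To make the Sum merge strategy concrete, here is a small Python sketch of the same idea. It is an illustration, not Gephi’s own code, and it assumes every row of the edgelist counts as one nomination. The sample rows are invented.

```python
from collections import Counter

# Hypothetical (Source, Target) rows; A nominated B on three different questions.
rows = [("A", "B"), ("A", "B"), ("A", "B"), ("B", "A"), ("B", "C")]

def merge_edges_by_sum(rows):
    """Collapse repeated (source, target) pairs; the row count becomes the weight."""
    return Counter(rows)

weights = merge_edges_by_sum(rows)
for (source, target), weight in sorted(weights.items()):
    print(source, "->", target, "weight", weight)
```

Because the network is directed, A 🡪 B and B 🡪 A are kept as two separate edges, each with its own weight.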

Figure 11. Preview of the edgelist imported into Gephi (Edges Tab)

To import the node attributes file, go to Data Laboratory 🡪 Import Spreadsheet 🡪 Choose node attributes file.

Figure 12. Importing Node Attributes file
Figure 13. Preview node attributes imported

To add labels to the nodes, follow these steps: in Data Table, click on Nodes 🡪 Copy Data to Other Column 🡪 choose Id 🡪 choose Label 🡪 press OK.

Figure 14. Adding labels to nodes

Get started with analysis

Figure 15. Overview --> Statistics

Start calculating network statistics in Overview, such as Average Degree, Network Diameter, Modularity, or PageRank. Node-level statistics, such as PageRank, other centrality measures, and the Modularity class, are added to the Data Laboratory for each node. Statistics calculated for the entire network appear in the Overview window, next to each indicator calculated (e.g., Clustering Coefficient, Modularity).

Centrality measures can be used to generate node rankings, showing the hierarchy of popularity, attention, or influence in the people’s network. The higher the score, the more centrally connected that node is.
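
One of the simplest centrality measures is in-degree: the number of nominations a person receives. The sketch below shows how such a ranking could be computed by hand in Python. The edges are invented, and the code only illustrates the idea; it is not how Gephi computes its statistics internally.

```python
from collections import Counter

# Hypothetical directed edges: (Source, Target) means Source nominated Target.
edges = [("A", "B"), ("C", "B"), ("D", "B"), ("B", "C"), ("A", "C"), ("D", "A")]

def in_degree_ranking(edges):
    """Rank nodes by nominations received, highest first (ties broken by name)."""
    nodes = {node for edge in edges for node in edge}
    received = Counter(target for _, target in edges)
    return sorted(
        ((node, received[node]) for node in nodes),
        key=lambda item: (-item[1], item[0]),
    )

ranking = in_degree_ranking(edges)
# In a directed network, every edge adds exactly one to someone's in-degree,
# so the average in-degree equals the number of edges divided by the number of nodes.
average_in_degree = len(edges) / len(ranking)

for node, score in ranking:
    print(node, score)
print("Average in-degree:", average_in_degree)
```

In this made-up example B ranks first with three nominations, while D nominates others but receives none.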

  

Get started with visualizing

The choice of algorithm is a combination of explorative analysis and crafting a well-delivered message to the reader. The network graph’s final appearance should focus on delivering one or two key messages. For example:

Figure 16. Gender and In-Degree

Blue = male

Yellow = female

Node size = the number of nominations received from peers. The larger the node, the more nominations received.
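
The legend above amounts to two mappings: a category to a color, and a score to a size. The following Python sketch spells them out under stated assumptions. The people, the hex color values, and the size range of 10 to 50 are arbitrary choices for illustration, and the linear scaling is one plausible way to turn in-degree into node size, not a description of Gephi’s internals.

```python
# Colors follow the legend above; the exact hex values are arbitrary choices.
GENDER_COLORS = {"male": "#1f77b4", "female": "#ffd700"}

def node_sizes(in_degree, min_size=10.0, max_size=50.0):
    """Linearly scale in-degree scores into the range [min_size, max_size]."""
    low, high = min(in_degree.values()), max(in_degree.values())
    if high == low:
        # All nodes have the same score, so they all get the same size.
        return {node: min_size for node in in_degree}
    span = high - low
    return {
        node: min_size + (score - low) / span * (max_size - min_size)
        for node, score in in_degree.items()
    }

# Hypothetical people with their gender and nominations received.
gender = {"A": "female", "B": "male", "C": "female", "D": "male"}
in_degree = {"A": 1, "B": 3, "C": 2, "D": 0}

sizes = node_sizes(in_degree)
styles = {
    node: {"color": GENDER_COLORS[gender[node]], "size": sizes[node]}
    for node in gender
}
for node, style in sorted(styles.items()):
    print(node, style)
```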

Gephi Tutorial: How to explore and visualize your data?

Dr. Silvia Fierăscu, Head of OrgMapper | Academy, recommends always exploring the data before delving into a more systematic analysis. Exploration familiarizes you with what the data mean from an analytic and theoretical point of view and with the statistics and data transformations available, and it lets you practice with different layouts until you arrive at meaningful visuals of the results.

No single indicator gives enough insight into the complexity of the ecosystem analyzed. Use a mosaic of network statistics at all three levels (network, community, and individual) to paint a broader picture, question assumptions, and understand positions and the opportunities and constraints they entail.

Layouts

For large networks, anything above 100 nodes, use an organic layout algorithm, such as ForceAtlas2. ForceAtlas2 is a spring embedder that shrinks the distance among nodes that are highly connected and pushes further apart nodes that are less connected.

For small networks, choose an algorithm such as Fruchterman Reingold, which equalizes the distances between nodes.
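
To give a feel for what a force-directed layout does, here is a deliberately simplified Python sketch in the spirit of Fruchterman Reingold. It is neither ForceAtlas2 nor Gephi’s implementation. The function name `spring_layout`, the constants (200 iterations, step size 0.1, cooling factor 0.98), and the sample network are all assumptions chosen for illustration. Every pair of nodes repels, every edge pulls its endpoints together, and nodes move a little on each pass.

```python
import math
import random

def spring_layout(nodes, edges, iterations=200, seed=42):
    """Simplified force-directed layout: all pairs repel, linked pairs attract."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred distance between linked nodes
    step = 0.1                       # maximum move per iteration, shrinks over time
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion: every pair of nodes pushes each other apart.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                disp[a][0] += dx / dist * push
                disp[a][1] += dy / dist * push
                disp[b][0] -= dx / dist * push
                disp[b][1] -= dy / dist * push
        # Attraction: each edge pulls its two endpoints together.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            disp[a][0] -= dx / dist * pull
            disp[a][1] -= dy / dist * pull
            disp[b][0] += dx / dist * pull
            disp[b][1] += dy / dist * pull
        # Move each node along its net force, capped at the current step size.
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            move = min(length, step)
            pos[n][0] += disp[n][0] / length * move
            pos[n][1] += disp[n][1] / length * move
        step *= 0.98  # cool down so the layout settles
    return {n: (p[0], p[1]) for n, p in pos.items()}

# Hypothetical network: two triangles joined by a single edge (C-D).
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"), ("C", "D")]

layout = spring_layout(nodes, edges)
for node, (x, y) in layout.items():
    print(node, round(x, 2), round(y, 2))
```

The fixed seed makes the result repeatable. In Gephi, by contrast, layouts start from the current node positions, so running the same algorithm twice can give different pictures.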


To find out more about layouts, download our full guide!


Colors

In Gephi, colors are a medium to communicate messages about the meaning of the complex ecosystem explored. You can use colors to represent categorical node attributes, such as hierarchy levels, locations, or groupings. Colors can also be used to rank nodes visually, with more intense colors for higher node statistics and less intense colors for lower ones. To find out more about colors, receive our full guide in your inbox!

Filters

Filters are a great tool to zoom in and out of different pockets of the network. If your Filters panel is missing, add it from the Window menu in the main menu bar. To find out more about filters, receive our full guide in your inbox!
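
Conceptually, a filter keeps only the part of the network that meets a condition. The Python sketch below illustrates two common cases, filtering by edge weight and by a node attribute. The function names, the sample data, and the Location values are hypothetical, and the code is an analogy for what filters do rather than Gephi’s filter engine.

```python
# Hypothetical weighted edges and node attributes.
weights = {("A", "B"): 3, ("B", "C"): 1, ("C", "A"): 1, ("C", "D"): 2}
attributes = {
    "A": {"Location": "HQ"},
    "B": {"Location": "HQ"},
    "C": {"Location": "Branch"},
    "D": {"Location": "Branch"},
}

def filter_by_weight(weights, minimum):
    """Keep only edges whose weight is at least `minimum`."""
    return {pair: w for pair, w in weights.items() if w >= minimum}

def filter_by_attribute(weights, attributes, key, value):
    """Keep only edges whose two endpoints both have attributes[key] == value."""
    keep = {node for node, attrs in attributes.items() if attrs.get(key) == value}
    return {
        (source, target): w
        for (source, target), w in weights.items()
        if source in keep and target in keep
    }

strong_ties = filter_by_weight(weights, 2)
hq_only = filter_by_attribute(weights, attributes, "Location", "HQ")
print("Strong ties:", strong_ties)
print("HQ only:", hq_only)
```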

Statistics

Here are a few action points: explore network structures and network-level statistics, explore network communities, and explore key nodes. If you want more detailed steps for each action point, receive our full guide in your inbox!

Gephi tutorial final steps: save and share

Finally, after you have explored the data at the network, community, and individual levels, after you have contextualized the results and understood the main message(s) to be delivered through a network graph, click on Preview and finalize the network visualization for the export.

Nota Bene: In Gephi, all parameter modifications in the Preview window need to be followed by clicking on the Refresh button to be seen on the screen.

The default visualization keeps the colors of the nodes and edges like in the Overview but has curved edges. If you’d like to visualize straight edges, choose the Default Straight button in the Presets drop-down menu. For more specifics and tips regarding saving your work, receive our full guide in your inbox!

Thank you for reading through. If you’d like to learn more about how to use Gephi for Organizational Network Analysis, join us for the Global ONA Mentorship Program, every March, July and October.
