What Are Scientometrics and Informetrics?
Scientometrics and Informetrics are quantitative disciplines that study the structure, properties, and dynamics of science and information using mathematical and statistical methods. They provide objective insights into research patterns, scientific communication, and knowledge production across disciplines.
Core Definitions
- Scientometrics: The quantitative study of science, scientific development, and scientific policy using bibliometric and other quantitative methods
- Informetrics: The broader field encompassing quantitative aspects of information in any form, including bibliometrics, scientometrics, webometrics, and altmetrics
- Bibliometrics: Statistical analysis of books, articles, and other publications to understand patterns of publication, authorship, and citation
- Citation Analysis: Examination of the frequency, patterns, and graphs of citations in documents and literature
Evolution and Current Trends (2024-2025)
The field has evolved significantly with technological advancement:
- Big Data Analytics: Processing millions of publications and citations in real-time
- Machine Learning Integration: Automated classification and prediction of research trends
- Altmetrics Expansion: Social media mentions, downloads, and online attention metrics
- Open Science Metrics: Measuring impact of open access and open data initiatives
- Real-time Assessment: Dynamic tracking of research impact and collaboration patterns
Key Application Areas
Research Assessment
Purpose: Evaluate research quality, impact, and performance
Applications:
- Individual researcher evaluation
- Journal impact assessment
- Institutional ranking systems
- Grant funding decisions
- Tenure and promotion processes
Science Mapping
Purpose: Visualize and understand the structure of scientific fields
Applications:
- Research trend identification
- Interdisciplinary analysis
- Collaboration network mapping
- Knowledge domain visualization
- Research gap discovery
Altmetrics
Purpose: Measure broader societal impact of research
Applications:
- Social media engagement tracking
- Policy document citations
- Media coverage analysis
- Educational resource usage
- Public engagement measurement
Network Analysis
Purpose: Study relationships and connections in science
Applications:
- Co-authorship networks
- Citation networks
- Co-word analysis
- Institutional collaborations
- Knowledge flow patterns
Fundamental Metrics and Indicators
Impact Factor (IF)
Purpose: Measures the average number of citations received in a given year by the items a journal published in the two preceding years
Limitations: Subject field variations, time window constraints, manipulation potential
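As a rough illustration of the two-year calculation, the sketch below computes an impact factor from made-up counts. The function name `impact_factor` and the journal figures are hypothetical and do not come from any database.

```r
# Two-year impact factor: citations received in year Y to items published
# in years Y-1 and Y-2, divided by the citable items published in those years.
impact_factor <- function(citations_to_prev_two_years, citable_items_prev_two_years) {
  stopifnot(citable_items_prev_two_years > 0)
  citations_to_prev_two_years / citable_items_prev_two_years
}

# Hypothetical journal: its 2022-2023 papers were cited 450 times in 2024,
# and it published 180 citable items in 2022-2023.
jif_2024 <- impact_factor(450, 180)
print(jif_2024)  # 2.5
```

Because the denominator counts only citable items while the numerator can include citations to any item, small changes in what counts as "citable" can shift the result, which is one source of the manipulation potential noted above.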
h-index
Purpose: Balances productivity and impact for individual researchers
Advantages: Single number summary, robust to outliers, career-long measure
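The definition behind the h-index (the largest h such that h papers each have at least h citations) can be sketched in a few lines of base R. The function name `h_index` and the citation counts are illustrative assumptions.

```r
# h-index: the largest h such that h papers have at least h citations each.
h_index <- function(citations) {
  sorted <- sort(citations, decreasing = TRUE)
  # Count the ranks at which the citation count still meets or exceeds the rank
  sum(sorted >= seq_along(sorted))
}

# Hypothetical citation counts for one researcher's papers
cites <- c(25, 8, 5, 3, 3, 1, 0)
h_index(cites)                           # 3

# Robustness to outliers: one very highly cited paper does not change h
h_index(c(2500, 8, 5, 3, 3, 1, 0))       # 3
```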
Field-Weighted Citation Impact (FWCI)
Purpose: Normalizes citations by subject field, publication year, and document type
Interpretation: 1.0 = world average, >1.0 = above average, <1.0 = below average
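A minimal sketch of the ratio behind FWCI, assuming the expected citation counts for each paper's field, year, and document type are already known. The baselines and paper IDs below are invented for illustration; real baselines come from the citation database.

```r
# FWCI sketch: actual citations divided by the expected (average) citations
# for publications of the same field, year, and document type.
papers <- data.frame(
  id        = c("p1", "p2", "p3"),
  citations = c(12, 3, 40),
  expected  = c(6, 6, 20)   # hypothetical field/year/type baselines
)
papers$fwci <- papers$citations / papers$expected
print(papers)

# Average over the set: (2 + 0.5 + 2) / 3 = 1.5, i.e. above the world average
mean(papers$fwci)
```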
Relative Citation Ratio (RCR)
Purpose: NIH's field-normalized citation metric
Features: Uses co-citation networks for field definition, updated annually
Introduction to Citation Analysis
Citation analysis is the examination of the frequency, patterns, and graphs of citations in documents and literature. It serves as a fundamental method in scientometrics for understanding scholarly communication, research impact, and knowledge structures within and across scientific disciplines.
Core Principles of Citation Analysis
- Citation as Quality Indicator: Assumes cited works have contributed to citing work
- Citation as Influence Measure: Citations reflect intellectual influence and knowledge transfer
- Citation Networks: Documents are connected through citation relationships
- Temporal Patterns: Citations show knowledge evolution and aging
- Disciplinary Variations: Citation practices vary across fields
1. Types of Citation Analysis
Direct Citation Analysis
- Forward Citations: Papers citing the target paper
- Backward Citations: Papers cited by the target paper
- Self-Citations: Author or journal self-referencing
- Citation Counts: Raw number of citations received
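Forward and backward citation counts fall out directly from an edge list of citing and cited papers. The sketch below uses a small hypothetical edge list; the paper labels are placeholders.

```r
# Hypothetical edge list: each row means "citing" cites "cited"
edges <- data.frame(
  citing = c("B", "C", "C", "D", "D", "D"),
  cited  = c("A", "A", "B", "A", "B", "C"),
  stringsAsFactors = FALSE
)
papers <- sort(unique(c(edges$citing, edges$cited)))

# Forward citations: how often each paper is cited (its raw citation count)
forward <- table(factor(edges$cited, levels = papers))
# Backward citations: how many references each paper makes
backward <- table(factor(edges$citing, levels = papers))

data.frame(paper    = papers,
           forward  = as.integer(forward),
           backward = as.integer(backward))
```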
Co-citation Analysis
- Document Co-citation: Two papers cited together
- Author Co-citation: Two authors cited together
- Journal Co-citation: Two journals cited together
- Strength Measure: Frequency of co-citation occurrence
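Co-citation strength can be computed from a citing-by-cited matrix with one matrix product. This is a small sketch with hypothetical papers P1..P3 and references R1..R3, not output from a real database.

```r
# Rows = citing papers, columns = cited references (1 = row cites column)
A <- matrix(c(1, 1, 0,
              1, 1, 1,
              0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("P1", "P2", "P3"), c("R1", "R2", "R3")))

# Co-citation counts: entry [i, j] = number of papers citing both i and j
cocitation <- t(A) %*% A
diag(cocitation) <- 0   # a reference is not co-cited with itself
cocitation
```

Here R1 and R2 are cited together by two papers (P1 and P2), while R1 and R3 are cited together only by P2.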
Bibliographic Coupling
- Shared References: Papers that cite the same references
- Coupling Strength: Number of shared references
- Contemporary Similarity: Reflects current research similarity
- Network Formation: Creates research front networks
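Coupling strength is the mirror image of co-citation: it counts shared references between citing papers. A minimal sketch, again with hypothetical papers and references:

```r
# Rows = citing papers, columns = cited references (1 = row cites column)
A <- matrix(c(1, 1, 0, 0,
              1, 1, 1, 0,
              0, 0, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("P1", "P2", "P3"), c("R1", "R2", "R3", "R4")))

# Coupling strength: entry [i, j] = number of references shared by papers i and j
coupling <- A %*% t(A)
diag(coupling) <- 0   # ignore each paper's overlap with itself
coupling
```

Unlike co-citation counts, these values are fixed once the papers are published, because a paper's reference list does not change afterwards.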
Citation Context Analysis
- Citation Function: Reason for citing (support, contrast, method)
- Citation Sentiment: Positive, negative, or neutral citation
- Citation Location: Position within the citing paper
- Citation Density: Citations per page or section
2. Citation Databases and Sources
Major Citation Databases
Web of Science (WoS)
- Coverage: 21,000+ journals across disciplines
- Time Span: 1900-present (varies by database)
- Strengths: Long historical coverage, citation tracking
- Data: Science Citation Index, Social Sciences Citation Index
Scopus
- Coverage: 28,000+ journals, broader source types
- Time Span: Cited references from 1970 to present (some records are older)
- Strengths: Larger coverage, author profiles, metrics
- Features: CiteScore, SJR, SNIP indicators
Google Scholar
- Coverage: Comprehensive, includes grey literature
- Access: Free, web-based interface
- Limitations: Quality control issues, duplicate records
- Benefits: Broad coverage, real-time updates
Dimensions
- Coverage: 130M+ publications, grants, patents
- Features: Policy citations, clinical trials, datasets
- Access: Free tier available
- Innovation: Links to funding and policy outcomes
3. Citation Analysis Workflow
Step-by-Step Citation Analysis Process
Step 1: Define Research Question
Examples:
- What is the citation impact of artificial intelligence research in education?
- How has climate change research evolved through citation patterns?
- Which authors are most influential in machine learning?
Step 2: Database Selection and Search Strategy
Step 3: Data Collection and Cleaning
- Download citation data (full records with cited references)
- Remove duplicates and irrelevant records
- Standardize author names and institutional affiliations
- Validate publication years and document types
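The cleaning steps above can be sketched in base R. The column names (`title`, `author`, `year`), the sample records, and the 1900 lower bound are assumptions for illustration; real exports need more careful name disambiguation than simple string normalization.

```r
# Hypothetical raw records with a duplicate title, inconsistent author
# spellings, and one malformed publication year
records <- data.frame(
  title  = c("Deep Learning in Education", "deep learning in education ", "Citation Networks"),
  author = c("Smith, J.", "SMITH J", "Lee, K."),
  year   = c("2021", "2021", "20x2"),
  stringsAsFactors = FALSE
)

# Standardize author names: upper case, drop punctuation, collapse spaces
records$author <- toupper(records$author)
records$author <- gsub("[[:punct:]]", "", records$author)
records$author <- trimws(gsub("\\s+", " ", records$author))

# Remove duplicates using a normalized title key
title_key <- tolower(trimws(records$title))
records <- records[!duplicated(title_key), ]

# Validate publication years: non-numeric values become NA and are flagged
records$year <- suppressWarnings(as.integer(records$year))
current_year <- as.integer(format(Sys.Date(), "%Y"))
records$year_ok <- !is.na(records$year) &
  records$year >= 1900 & records$year <= current_year
records
```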
Step 4: Citation Network Construction
- Create citation matrices (citing × cited)
- Build co-citation networks
- Calculate bibliographic coupling strengths
- Apply network analysis techniques
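To make the matrix construction step concrete, the sketch below turns a semicolon-separated cited-references field into a citing × cited matrix. The column name `CR` mirrors the Web of Science field tag, but the records and reference labels are hypothetical.

```r
# Hypothetical records: CR holds cited references separated by ";"
records <- data.frame(
  id = c("P1", "P2", "P3"),
  CR = c("R1;R2", "R2;R3", "R1;R2;R3"),
  stringsAsFactors = FALSE
)

# Split the reference strings into one list element per citing paper
ref_list <- strsplit(records$CR, ";", fixed = TRUE)
names(ref_list) <- records$id

# Long edge list, then a citing x cited incidence matrix
edges <- stack(ref_list)   # columns: values (cited), ind (citing)
citation_matrix <- unclass(table(citing = edges$ind, cited = edges$values))
citation_matrix
```

From this matrix, co-citation and coupling counts follow as `crossprod(citation_matrix)` and `tcrossprod(citation_matrix)` respectively.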
R Tools for Scientometric Analysis
R is a powerful statistical programming language with numerous packages specifically designed for bibliometric and scientometric analysis. It provides comprehensive tools for data manipulation, statistical analysis, network analysis, and visualization of citation data.
Key R Packages for Scientometrics
- bibliometrix: Comprehensive bibliometric analysis and visualization
- igraph: Network analysis and graph visualization
- tidyverse: Data manipulation and visualization ecosystem
- net2VOSviewer() (a bibliometrix function): Exports a bibliometrix network to VOSviewer for clustering and visualization
- scholar: Google Scholar data extraction and analysis
- RefManageR: Bibliography and citation management
1. Getting Started with bibliometrix
Installation and Setup
# Install required packages (run once)
install.packages(c("bibliometrix", "igraph", "tidyverse", "networkD3", "visNetwork"))
# Load libraries
library(bibliometrix)
library(igraph)
library(tidyverse)
# For interactive analysis
biblioshiny() # Launches web interface
Data Import and Conversion
# Import Web of Science data
wos_data <- convert2df(file = "wos_data.txt", dbsource = "wos", format = "plaintext")
# Import Scopus data
scopus_data <- convert2df(file = "scopus_data.bib", dbsource = "scopus", format = "bibtex")
# Basic data overview
dim(wos_data) # Dimensions of dataset
names(wos_data) # Variable names
2. Descriptive Analysis
Basic Bibliometric Analysis
# Run the descriptive bibliometric analysis
results <- biblioAnalysis(wos_data, sep = ";")
# Summary of main information
summary(results, k = 10, pause = FALSE)
# Plot results
plot(x = results, k = 10, pause = FALSE)
Most Cited Papers and Authors
# Most frequently cited references
most_cited <- citations(wos_data, field = "article", sep = ";")
cbind(most_cited$Cited[1:10])
# Most productive authors
most_productive <- authorProdOverTime(wos_data, k = 10, graph = TRUE)
# H-index calculation for authors
h_index <- Hindex(wos_data, field = "author", elements = "SMITH J",
                  sep = ";", years = 10)
3. Network Analysis and Visualization
Co-authorship Network Analysis
# Build the author collaboration network
coauth_network <- biblioNetwork(wos_data, analysis = "collaboration", network = "authors", sep = ";")
# Network statistics
net_stat <- networkStat(coauth_network)
summary(net_stat, k = 10)
# Visualize network
net_plot <- networkPlot(coauth_network, normalize = "association", weighted = TRUE, n = 50,
                        Title = "Co-authorship Network", type = "fruchterman",
                        size = TRUE, remove.multiple = FALSE)
Co-citation and Coupling Networks
# Co-citation network
cocit_network <- biblioNetwork(wos_data, analysis = "co-citation", network = "references", sep = ";")
# Bibliographic coupling
coupling_network <- biblioNetwork(wos_data, analysis = "coupling", network = "references", sep = ";")
# Historical citation network
histnet <- histNetwork(wos_data, min.citations = 10, sep = ";")
4. Advanced Analysis Techniques
Conceptual Structure Mapping
# Keyword co-occurrence network
coword_network <- biblioNetwork(wos_data, analysis = "co-occurrences", network = "keywords", sep = ";")
# Conceptual structure via MCA
conceptual_map <- conceptualStructure(wos_data, field = "ID", method = "MCA", minDegree = 4, clust = 5, stemming = FALSE, labelsize = 10, documents = 20)
# Thematic evolution
thematic_evolution <- thematicEvolution(wos_data, field = "ID", years = c(2020, 2022, 2024), n = 250, minFreq = 2)
Keyword Network Clustering and Interactive Visualization
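# Author keyword co-occurrence network
keyword_net <- biblioNetwork(wos_data, analysis = "co-occurrences", network = "author_keywords", sep = ";")
# Network statistics (clustering is chosen in networkPlot(), not in networkStat())
net_stats <- networkStat(keyword_net)
# Clustering and visualization; the result holds an igraph object in $graph
keyword_plot <- networkPlot(keyword_net, normalize = "association", n = 100, cluster = "louvain", type = "auto")
# Create interactive network from the igraph object
library(visNetwork)
visIgraph(keyword_plot$graph)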
5. Publication Trend Analysis
Temporal Analysis Example
# Annual scientific production (PY is the publication year field)
annual_prod <- as.data.frame(table(Year = wos_data$PY))
print(annual_prod)
# Author production over time
author_time <- authorProdOverTime(wos_data, k = 10, graph = TRUE)
# Top authors' production over time (dfAU has one row per author and year)
library(ggplot2)
ggplot(author_time$dfAU, aes(x = year, y = freq, color = Author)) +
  geom_line(linewidth = 1) +
  geom_point(size = 2) +
  theme_minimal() +
  labs(title = "Author Production Over Time", x = "Year", y = "Number of Publications")
VOSviewer for Science Mapping
VOSviewer is a software tool for constructing and visualizing bibliometric networks. It can be used to create maps of authors, publications, journals, keywords, or terms based on co-occurrence, citation, bibliographic coupling, or co-authorship relations.
VOSviewer Key Features
- Network Visualization: Create and display large-scale bibliometric networks
- Clustering Algorithm: Advanced community detection using modularity optimization
- Multiple View Options: Network, overlay, and density visualizations
- Interactive Interface: Zoom, pan, and explore network details
- Data Export: High-quality images and network data export
- Database Integration: Direct import from Web of Science, Scopus, PubMed
1. Getting Started with VOSviewer
Installation and Setup
- Download: Free from www.vosviewer.com
- Requirements: Java Runtime Environment (JRE) 8 or higher
- Platform: Windows, macOS, Linux compatible
- Version: Current version 1.6.19 (as of 2024)
System Requirements
- Memory: Minimum 4GB RAM (8GB+ recommended for large datasets)
- Java Heap Space: Increase for large networks (java -Xmx4g -jar VOSviewer.jar)
- Display: High resolution recommended for network visualization
2. Creating Networks from Database Files
Web of Science Import
- Export Data: Download "Full Record and Cited References" in Plain Text format
- VOSviewer Import: Create map from bibliographic database files
- File Selection: Choose .txt file from Web of Science
- Analysis Type: Select co-authorship, co-occurrence, citation, or coupling
1. File → Create → Map from bibliographic database files
2. Select Web of Science file (savedrecs.txt)
3. Choose "Co-authorship" analysis
4. Set unit of analysis: "Authors"
5. Set minimum number of documents: 5
6. Select authors to include (typically top 100-500)
Scopus Import Process
- Output: CSV Export
- Information to include: Citation information, Bibliographical information, Abstract & keywords
- File format: CSV (Comma separated)
VOSviewer Import:
1. Create → Map from bibliographic database files
2. Select Scopus CSV file
3. Choose analysis type and unit
4. Apply thresholds and filters
3. Network Analysis Options
Co-authorship Networks
- Authors: Network of collaborating researchers
- Organizations: Institutional collaboration networks
- Countries: International research collaboration
- Metrics: Total link strength, cluster membership
- Minimum 2-5 documents per author
- Focus on top 100-500 authors
- Use normalization for fair comparison
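One common normalization is association strength, which divides each co-occurrence count by the product of the two items' totals. The sketch below shows one form of it in base R; the counts are hypothetical, and the exact variant a given tool applies may differ.

```r
# Hypothetical co-authorship counts between three authors
co <- matrix(c(0, 6, 2,
               6, 0, 1,
               2, 1, 0),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Total link strength per author (row sums of the co-occurrence matrix)
s <- rowSums(co)

# Association strength: observed co-occurrences divided by the product
# of the two items' totals, which corrects for highly productive items
assoc <- co / outer(s, s)
round(assoc, 3)
```

After normalization, the A-C link (2 / (8 × 3) ≈ 0.083) is closer to the A-B link (6 / (8 × 7) ≈ 0.107) than the raw counts of 2 and 6 suggest, because author C has far fewer links overall.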
Co-occurrence Networks
- Author Keywords: Co-occurrence of author-provided keywords
- Index Keywords: Database-assigned subject terms
- Title Terms: Terms extracted from publication titles
- Abstract Terms: Terms extracted from publication abstracts
- Stemming and lemmatization options
- Stop word removal
- Manual term cleaning and synonyms
Citation Networks
- Document Citations: Which papers cite which papers
- Source Citations: Journal citation patterns
- Author Citations: Citation relationships between authors
- Temporal Analysis: Citation evolution over time
- Citation count and normalized scores
- Average publication year
- Average citations per publication
Bibliographic Coupling
- Document Coupling: Papers sharing common references
- Author Coupling: Authors with overlapping reference lists
- Source Coupling: Journals with similar reference patterns
- Strength Measure: Number of shared references
- Research front identification
- Contemporary similarity mapping
- Subject area delineation
4. Visualization and Interpretation
Network Visualization Options
Network View
Purpose: Shows the structure of the network with nodes and connections
- Node Size: Based on weight (citations, occurrences)
- Link Thickness: Strength of relationship
- Colors: Cluster membership or scores
- Layout: Force-directed algorithm for optimal positioning
Overlay View
Purpose: Overlays additional information on the network structure
- Average Publication Year: Temporal evolution visualization
- Average Citations: Impact visualization
- Custom Scores: User-defined metrics overlay
- Color Gradient: Continuous scale representation
Density View
Purpose: Shows the density of items in different areas of the map
- Hot Spots: Areas with high concentration of items
- Smooth Visualization: Continuous density surface
- Core Areas: Identification of research concentrations
- Periphery Detection: Emerging or specialized topics
5. Advanced Features and Customization
Clustering and Community Detection
- Resolution: Controls cluster size (default 1.0)
* Higher values → smaller clusters
* Lower values → larger clusters
- Minimum cluster size: Minimum items per cluster
- Clustering algorithm: Modularity-based optimization
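The effect of the resolution parameter can be illustrated with the standard resolution-weighted modularity score. This is a sketch under that assumption: VOSviewer's own clustering technique is a related variant, and the function name `modularity_gamma` and the toy network are hypothetical.

```r
# Modularity with resolution gamma:
# Q = (1 / 2m) * sum_ij [A_ij - gamma * k_i * k_j / (2m)] * same_cluster(i, j)
modularity_gamma <- function(adj, membership, gamma = 1) {
  k <- rowSums(adj)                 # node degrees (or strengths)
  two_m <- sum(adj)                 # twice the total edge weight
  same <- outer(membership, membership, "==")
  sum((adj - gamma * outer(k, k) / two_m) * same) / two_m
}

# Hypothetical network: two triangles joined by a single edge (3-4)
adj <- matrix(0, 6, 6)
edge_list <- rbind(c(1, 2), c(1, 3), c(2, 3), c(4, 5), c(4, 6), c(5, 6), c(3, 4))
adj[edge_list] <- 1
adj[edge_list[, 2:1]] <- 1          # make the matrix symmetric

one_cluster  <- rep(1, 6)
two_clusters <- c(1, 1, 1, 2, 2, 2)

for (gamma in c(0.2, 1)) {
  cat("gamma =", gamma,
      "| one cluster:", round(modularity_gamma(adj, one_cluster, gamma), 3),
      "| two clusters:", round(modularity_gamma(adj, two_clusters, gamma), 3), "\n")
}
```

At resolution 1.0 the two-triangle partition scores higher (about 0.357 versus 0), while at 0.2 the single large cluster wins (0.8 versus about 0.757), which matches the rule that lower values favor larger clusters.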
Manual Cluster Refinement:
- Move items between clusters
- Merge or split clusters
- Custom cluster colors and labels
Network Statistics and Metrics
- Number of items and links
- Total link strength
- Average clustering coefficient
- Network density
- Average path length
- Centrality measures (per item)
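Several of these statistics are simple to compute by hand from a weighted adjacency matrix, which helps when checking exported network data. The matrix below is hypothetical, and the variable names are illustrative.

```r
# Hypothetical weighted, undirected network as a symmetric matrix
w <- matrix(c(0, 3, 1, 0,
              3, 0, 2, 0,
              1, 2, 0, 4,
              0, 0, 4, 0),
            nrow = 4, byrow = TRUE)

upper <- w[upper.tri(w)]                  # each undirected pair counted once
n_items <- nrow(w)
n_links <- sum(upper > 0)                 # number of distinct links
total_link_strength <- sum(upper)         # sum of link weights
density <- n_links / (n_items * (n_items - 1) / 2)   # share of possible links present
degree <- rowSums(w > 0)                  # degree centrality per item

c(items = n_items, links = n_links,
  total_link_strength = total_link_strength, density = round(density, 3))
degree
```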
Export Options:
- Network data (GML, GraphML, Pajek)
- Map images (PNG, EPS, SVG)
- Item and cluster statistics
- Coordinate information
Real-World Applications
Explore practical applications of scientometrics and informetrics across different domains and research scenarios.
Research Evaluation
Assessment of individual, institutional, and national research performance
Science Policy
Evidence-based science and technology policy development
Science Mapping
Visualization and analysis of knowledge domains and research fronts
Trend Prediction
Forecasting emerging research areas and technological developments
Tools and Resources
Comprehensive collection of tools, databases, and resources for scientometric analysis.
Analysis Software
Software tools for bibliometric and scientometric analysis
Databases
Citation databases and data sources
Learning Resources
Books, courses, and educational materials
Journals & Communities
Academic journals and professional communities
