ShodhSarthi

DULS Guide to Scientometrics and Informetrics

Comprehensive guide to quantitative science studies, citation analysis, and bibliometric tools

What Are Scientometrics and Informetrics?

Scientometrics and Informetrics are quantitative disciplines that study the structure, properties, and dynamics of science and information using mathematical and statistical methods. They provide objective insights into research patterns, scientific communication, and knowledge production across disciplines.

Core Definitions

  • Scientometrics: The quantitative study of science, scientific development, and scientific policy using bibliometric and other quantitative methods
  • Informetrics: The broader field encompassing quantitative aspects of information in any form, including bibliometrics, scientometrics, webometrics, and altmetrics
  • Bibliometrics: Statistical analysis of books, articles, and other publications to understand patterns of publication, authorship, and citation
  • Citation Analysis: Examination of the frequency, patterns, and graphs of citations in documents and literature

Evolution and Current Trends (2024-2025)

The field has evolved significantly with technological advancement:

  • Big Data Analytics: Processing millions of publications and citations in real-time
  • Machine Learning Integration: Automated classification and prediction of research trends
  • Altmetrics Expansion: Social media mentions, downloads, and online attention metrics
  • Open Science Metrics: Measuring impact of open access and open data initiatives
  • Real-time Assessment: Dynamic tracking of research impact and collaboration patterns

Key Application Areas

Research Assessment

Purpose: Evaluate research quality, impact, and performance

Applications:

  • Individual researcher evaluation
  • Journal impact assessment
  • Institutional ranking systems
  • Grant funding decisions
  • Tenure and promotion processes
Key Metrics: h-index, Impact Factor, CiteScore, Field-Weighted Citation Impact

Science Mapping

Purpose: Visualize and understand the structure of scientific fields

Applications:

  • Research trend identification
  • Interdisciplinary analysis
  • Collaboration network mapping
  • Knowledge domain visualization
  • Research gap discovery
Tools: VOSviewer, CiteSpace, Gephi, Pajek

Altmetrics

Purpose: Measure broader societal impact of research

Applications:

  • Social media engagement tracking
  • Policy document citations
  • Media coverage analysis
  • Educational resource usage
  • Public engagement measurement
Platforms: Altmetric, PlumX, ImpactStory, Dimensions

Network Analysis

Purpose: Study relationships and connections in science

Applications:

  • Co-authorship networks
  • Citation networks
  • Co-word analysis
  • Institutional collaborations
  • Knowledge flow patterns
Methods: Social Network Analysis, Graph Theory, Complex Networks

Fundamental Metrics and Indicators

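Impact Factor (IF)

IF(n) = Citations received in year n by items published in years n-1 and n-2 / Citable items published in years n-1 and n-2

Purpose: Measures the average number of citations received in a given year by the items a journal published in the two preceding years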

Limitations: Subject field variations, time window constraints, manipulation potential
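The two-year calculation above can be illustrated with a short Python sketch. The function name `impact_factor` and the journal figures are made up for illustration; official values depend on how the database classifies citable items.

```python
def impact_factor(citations_in_year, citable_items):
    """Two-year impact factor for census year n.

    citations_in_year: citations received in year n by items
                       published in years n-1 and n-2
    citable_items:     number of citable items published in n-1 and n-2
    """
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations_in_year / citable_items


# Hypothetical journal: 200 articles in 2022 and 220 in 2023,
# cited 1,260 times during 2024.
jif_2024 = impact_factor(1260, 200 + 220)
print(round(jif_2024, 2))  # 3.0
```

A value of 3.0 here means that, on average, each item from the two preceding years was cited three times in the census year.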

h-index

h-index = max{h ∈ ℕ : researcher has h papers with ≥ h citations each}

Purpose: Balances productivity and impact for individual researchers

Advantages: Single number summary, robust to outliers, career-long measure
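As a minimal sketch of the definition, the h-index can be computed by ranking papers by citation count and finding the last rank at which the count is still at least the rank. The function name `h_index` and the citation lists are illustrative only.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


print(h_index([10, 8, 5, 4, 3]))  # 4
print(h_index([100, 2, 1]))       # 2
```

The second call shows the robustness to outliers: one paper with 100 citations does not raise the h-index beyond what the rest of the record supports.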

Field-Weighted Citation Impact (FWCI)

FWCI = Citations received / Expected citations for field

Purpose: Normalizes citations by subject field, publication year, and document type

Interpretation: 1.0 = world average, >1.0 = above average, <1.0 = below average
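The ratio can be sketched as follows, assuming the expected value is simply the mean citation count of comparable publications (same field, year, and document type). The `fwci` function and the peer data are hypothetical; a database may use its own citation window and field classification.

```python
from statistics import mean


def fwci(citations, peer_citations):
    """Citations received divided by the mean citations of comparable papers."""
    expected = mean(peer_citations)
    if expected == 0:
        raise ValueError("expected citations must be greater than zero")
    return citations / expected


# Hypothetical peer group: same field, publication year, and document type.
peers = [2, 4, 6, 8, 10]   # mean = 6
print(fwci(9, peers))      # 1.5, i.e. 50% above the peer average
```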

Relative Citation Ratio (RCR)

RCR = Citations per year / Expected citations per year

Purpose: NIH's field-normalized citation metric

Features: Uses co-citation networks for field definition, updated annually

Introduction to Citation Analysis

Citation analysis is the examination of the frequency, patterns, and graphs of citations in documents and literature. It serves as a fundamental method in scientometrics for understanding scholarly communication, research impact, and knowledge structures within and across scientific disciplines.

Core Principles of Citation Analysis

  • Citation as Quality Indicator: Assumes cited works have contributed to citing work
  • Citation as Influence Measure: Citations reflect intellectual influence and knowledge transfer
  • Citation Networks: Documents are connected through citation relationships
  • Temporal Patterns: Citations show knowledge evolution and aging
  • Disciplinary Variations: Citation practices vary across fields

1. Types of Citation Analysis

Direct Citation Analysis

  • Forward Citations: Papers citing the target paper
  • Backward Citations: Papers cited by the target paper
  • Self-Citations: Author or journal self-referencing
  • Citation Counts: Raw number of citations received
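A small Python sketch may help make forward and backward citations concrete. It assumes the citation data is available as (citing, cited) pairs; the paper labels and the `citation_index` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical (citing, cited) pairs.
edges = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "C")]


def citation_index(edges):
    """Return forward (cited-by) and backward (references) lookups."""
    forward = defaultdict(set)   # paper -> papers that cite it
    backward = defaultdict(set)  # paper -> papers it cites
    for citing, cited in edges:
        forward[cited].add(citing)
        backward[citing].add(cited)
    return forward, backward


forward, backward = citation_index(edges)
print(sorted(forward["A"]))   # ['B', 'C']  forward citations of A
print(sorted(backward["C"]))  # ['A', 'B']  backward citations of C
print(len(forward["A"]))      # 2           raw citation count of A
```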

Co-citation Analysis

  • Document Co-citation: Two papers cited together
  • Author Co-citation: Two authors cited together
  • Journal Co-citation: Two journals cited together
  • Strength Measure: Frequency of co-citation occurrence
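Co-citation strength can be sketched by counting, for every pair of cited documents, how many citing papers list both. The reference lists below are invented and `cocitation_counts` is an illustrative helper, not part of any library.

```python
from collections import Counter
from itertools import combinations

# Hypothetical citing papers and their reference lists.
reference_lists = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}


def cocitation_counts(reference_lists):
    """Count how often each pair of references is cited together."""
    counts = Counter()
    for refs in reference_lists.values():
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


counts = cocitation_counts(reference_lists)
print(counts[("A", "B")])  # 2 (cited together by P1 and P2)
print(counts[("A", "C")])  # 1 (cited together by P1 only)
```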

Bibliographic Coupling

  • Shared References: Papers sharing common citations
  • Coupling Strength: Number of shared references
  • Contemporary Similarity: Reflects current research similarity
  • Network Formation: Creates research front networks
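Coupling strength is the mirror image of co-citation: it compares the reference lists of two citing papers. A minimal sketch, with invented data and an illustrative `coupling_strength` helper:

```python
def coupling_strength(refs_a, refs_b):
    """Number of references shared by two papers."""
    return len(set(refs_a) & set(refs_b))


# Hypothetical papers and their reference lists.
papers = {
    "P1": ["A", "B", "C"],
    "P2": ["B", "C", "D"],
    "P3": ["E"],
}

print(coupling_strength(papers["P1"], papers["P2"]))  # 2
print(coupling_strength(papers["P1"], papers["P3"]))  # 0
```

Because reference lists are fixed at publication, this value does not change over time, whereas co-citation counts grow as new citing papers appear.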

Citation Context Analysis

  • Citation Function: Reason for citing (support, contrast, method)
  • Citation Sentiment: Positive, negative, or neutral citation
  • Citation Location: Position within the citing paper
  • Citation Density: Citations per page or section

2. Citation Databases and Sources

Major Citation Databases

Web of Science (WoS)
  • Coverage: 21,000+ journals across disciplines
  • Time Span: 1900-present (varies by database)
  • Strengths: Long historical coverage, citation tracking
  • Data: Science Citation Index, Social Sciences Citation Index
Scopus
  • Coverage: 28,000+ journals, broader source types
  • Time Span: 1970-present
  • Strengths: Larger coverage, author profiles, metrics
  • Features: CiteScore, SJR, SNIP indicators
Google Scholar
  • Coverage: Comprehensive, includes grey literature
  • Access: Free, web-based interface
  • Limitations: Quality control issues, duplicate records
  • Benefits: Broad coverage, real-time updates
Dimensions
  • Coverage: 130M+ publications, grants, patents
  • Features: Policy citations, clinical trials, datasets
  • Access: Free tier available
  • Innovation: Links to funding and policy outcomes

3. Citation Analysis Workflow

Step-by-Step Citation Analysis Process

Step 1: Define Research Question

Examples:

  • What is the citation impact of artificial intelligence research in education?
  • How has climate change research evolved through citation patterns?
  • Which authors are most influential in machine learning?
Step 2: Database Selection and Search Strategy
Search Strategy Example (Web of Science):
    TS=("artificial intelligence" OR "machine learning")
    AND TS=(education* OR learn* OR teaching)
    AND PY=(2019-2024)
    Refined by: Document Types=(ARTICLE)
Step 3: Data Collection and Cleaning
  • Download citation data (full records with cited references)
  • Remove duplicates and irrelevant records
  • Standardize author names and institutional affiliations
  • Validate publication years and document types
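The cleaning steps can be sketched in Python as below. This is only one possible approach: it assumes records are dictionaries with `doi`, `title`, and `authors` keys (hypothetical field names), removes duplicates by DOI with a normalized-title fallback, and reduces author names to surname plus initials.

```python
import re


def normalize_author(name):
    """'Smith, John A.' -> 'SMITH JA' (surname plus initials)."""
    surname, _, given = name.partition(",")
    initials = "".join(p[0] for p in re.split(r"[\s.\-]+", given) if p)
    return f"{surname.strip()} {initials}".upper().strip()


def record_key(record):
    """Prefer the DOI; fall back to a punctuation-free lowercase title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title)


def deduplicate(records):
    """Keep the first record seen for each key."""
    seen, unique = set(), []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records; the first two differ only in case and punctuation.
records = [
    {"doi": "10.1000/xyz1", "title": "AI in Education", "authors": ["Smith, John A."]},
    {"doi": "10.1000/XYZ1", "title": "AI in education.", "authors": ["Smith, J. A."]},
    {"doi": "", "title": "Learning Analytics", "authors": ["Lee, Min-Ji"]},
]

clean = deduplicate(records)
print(len(clean))                          # 2
print(normalize_author("Smith, John A."))  # SMITH JA
print(normalize_author("Smith, J. A."))    # SMITH JA
```

Initials-based matching merges name variants but can also merge distinct people with the same surname and initials, so results for common names are worth checking by hand.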
Step 4: Citation Network Construction
  • Create citation matrices (citing × cited)
  • Build co-citation networks
  • Calculate bibliographic coupling strengths
  • Apply network analysis techniques
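The relationship between these steps can be sketched with a binary citing × cited matrix A: multiplying its transpose by A gives co-citation counts, and multiplying A by its transpose gives bibliographic coupling strengths. The example uses plain Python lists and invented data; for real datasets a sparse-matrix library would be a more practical choice.

```python
def citation_matrix(reference_lists):
    """Build a binary citing x cited matrix from reference lists."""
    citing = sorted(reference_lists)
    cited = sorted({ref for refs in reference_lists.values() for ref in refs})
    col = {ref: j for j, ref in enumerate(cited)}
    matrix = [[0] * len(cited) for _ in citing]
    for i, paper in enumerate(citing):
        for ref in set(reference_lists[paper]):
            matrix[i][col[ref]] = 1
    return citing, cited, matrix


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain-Python matrix product of a and b."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, c)) for c in b_cols] for row in a]


# Hypothetical reference lists.
reference_lists = {"P1": ["A", "B", "C"], "P2": ["A", "B"], "P3": ["B", "C"]}
citing, cited, A = citation_matrix(reference_lists)

cocitation = matmul(transpose(A), A)  # cited x cited
coupling = matmul(A, transpose(A))    # citing x citing

print(cocitation[0][1])  # 2: A and B are co-cited by P1 and P2
print(cocitation[1][1])  # 3: diagonal holds the citation count of B
print(coupling[1][2])    # 1: P2 and P3 share one reference (B)
```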

R Tools for Scientometric Analysis

R is a powerful statistical programming language with numerous packages specifically designed for bibliometric and scientometric analysis. It provides comprehensive tools for data manipulation, statistical analysis, network analysis, and visualization of citation data.

Key R Packages for Scientometrics

  • bibliometrix: Comprehensive bibliometric analysis and visualization
  • igraph: Network analysis and graph visualization
  • tidyverse: Data manipulation and visualization ecosystem
  • net2VOSviewer (function in bibliometrix): Exports networks for clustering and visualization in VOSviewer
  • scholar: Google Scholar data extraction and analysis
  • RefManageR: Bibliography and citation management

1. Getting Started with bibliometrix

Installation and Setup

    # Install required packages
    install.packages(c("bibliometrix", "igraph", "tidyverse", "networkD3", "visNetwork"))

    # Load libraries
    library(bibliometrix)
    library(igraph)
    library(tidyverse)

    # For interactive analysis
    biblioshiny()  # Launches web interface
Data Import and Conversion
    # Import Web of Science data
    wos_data <- convert2df(file = "wos_data.txt", dbsource = "wos", format = "plaintext")

    # Import Scopus data
    scopus_data <- convert2df(file = "scopus_data.bib", dbsource = "scopus", format = "bibtex")

    # Basic data overview
    dim(wos_data)    # Dimensions of dataset
    names(wos_data)  # Variable names

2. Descriptive Analysis

Basic Bibliometric Analysis
    # Comprehensive bibliometric analysis
    results <- biblioAnalysis(wos_data, sep = ";")

    # Summary of main information
    summary(results, k = 10, pause = FALSE)

    # Plot results
    plot(x = results, k = 10, pause = FALSE)
Most Cited Papers and Authors
    # Most cited papers
    most_cited <- citations(wos_data, field = "article", sep = ";")
    cbind(most_cited$Cited[1:10])

    # Most productive authors
    most_productive <- authorProdOverTime(wos_data, k = 10, graph = TRUE)

    # H-index calculation for authors
    h_index <- Hindex(wos_data, field = "author", elements = "SMITH J", sep = ";", years = 10)

3. Network Analysis and Visualization

Co-authorship Network Analysis

    # Create co-authorship network
    coauth_network <- biblioNetwork(wos_data, analysis = "collaboration",
                                    network = "authors", sep = ";")

    # Network statistics
    net_stat <- networkStat(coauth_network)
    summary(net_stat, k = 10)

    # Visualize network
    net_plot <- networkPlot(coauth_network, normalize = "association", weighted = TRUE,
                            n = 50, Title = "Co-authorship Network", type = "fruchterman",
                            size = TRUE, remove.multiple = FALSE)
Co-citation and Coupling Networks
    # Co-citation network
    cocit_network <- biblioNetwork(wos_data, analysis = "co-citation",
                                   network = "references", sep = ";")

    # Bibliographic coupling
    coupling_network <- biblioNetwork(wos_data, analysis = "coupling",
                                      network = "references", sep = ";")

    # Historical citation network
    histnet <- histNetwork(wos_data, min.citations = 10, sep = ";")

4. Advanced Analysis Techniques

Conceptual Structure Mapping
    # Co-word analysis
    coword_network <- biblioNetwork(wos_data, analysis = "co-occurrences",
                                    network = "keywords", sep = ";")

    # Conceptual structure via MCA
    conceptual_map <- conceptualStructure(wos_data, field = "ID", method = "MCA",
                                          minDegree = 4, clust = 5, stemming = FALSE,
                                          labelsize = 10, documents = 20)

    # Thematic evolution (years are the cutting points between time slices)
    thematic_evolution <- thematicEvolution(wos_data, field = "ID",
                                            years = c(2020, 2022, 2024),
                                            n = 250, minFreq = 2)
Science Mapping with Multiple Correspondence Analysis
    # Author keyword co-occurrence network
    keyword_net <- biblioNetwork(wos_data, analysis = "co-occurrences",
                                 network = "author_keywords", sep = ";")

    # Network-level statistics (networkStat() takes stat = "network" or "all",
    # not a clustering method)
    keyword_stat <- networkStat(keyword_net, stat = "network")

    # Louvain clustering is applied inside networkPlot()
    vis_net <- networkPlot(keyword_net, normalize = "association", n = 100,
                           cluster = "louvain", type = "auto")
    head(vis_net$cluster_res)  # Cluster assignment per keyword

5. Publication Trend Analysis

Temporal Analysis Example

    # Annual scientific production (publications per year, from the PY field)
    annual_prod <- as.data.frame(table(Year = wos_data$PY))
    print(annual_prod)

    # Top authors' production over time
    author_time <- authorProdOverTime(wos_data, k = 10, graph = TRUE)

    # Custom plot from the per-author, per-year table returned in $dfAU
    library(ggplot2)
    ggplot(author_time$dfAU, aes(x = year, y = freq, color = Author)) +
      geom_line(linewidth = 1) +  # use size = 1 with ggplot2 versions before 3.4
      geom_point(size = 2) +
      theme_minimal() +
      labs(title = "Author Production Over Time",
           x = "Year", y = "Number of Publications")

VOSviewer for Science Mapping

VOSviewer is a software tool for constructing and visualizing bibliometric networks. It can be used to create maps of authors, publications, journals, keywords, or terms based on co-occurrence, citation, bibliographic coupling, or co-authorship relations.

VOSviewer Key Features

  • Network Visualization: Create and display large-scale bibliometric networks
  • Clustering Algorithm: Advanced community detection using modularity optimization
  • Multiple View Options: Network, overlay, and density visualizations
  • Interactive Interface: Zoom, pan, and explore network details
  • Data Export: High-quality images and network data export
  • Database Integration: Direct import from Web of Science, Scopus, PubMed

1. Getting Started with VOSviewer

Installation and Setup

  • Download: Free from www.vosviewer.com
  • Requirements: Java Runtime Environment (JRE) 8 or higher
  • Platform: Windows, macOS, Linux compatible
  • Version: 1.6.19 at the time of writing; check the website for newer releases

System Requirements

  • Memory: Minimum 4GB RAM (8GB+ recommended for large datasets)
  • Java Heap Space: Increase for large networks (java -Xmx4g -jar VOSviewer.jar)
  • Display: High resolution recommended for network visualization

2. Creating Networks from Database Files

Web of Science Import
  1. Export Data: Download "Full Record and Cited References" in Plain Text format
  2. VOSviewer Import: Create map from bibliographic database files
  3. File Selection: Choose .txt file from Web of Science
  4. Analysis Type: Select co-authorship, co-occurrence, citation, or coupling
Example Workflow - Co-authorship Analysis:
  1. File → Create → Map from bibliographic database files
  2. Select Web of Science file (savedrecs.txt)
  3. Choose "Co-authorship" analysis
  4. Set unit of analysis: "Authors"
  5. Set minimum number of documents: 5
  6. Select authors to include (typically top 100-500)
Scopus Import Process
Scopus Data Export Settings:
  • Output: CSV Export
  • Information to include: Citation information, Bibliographical information, Abstract & keywords
  • File format: CSV (Comma separated)
VOSviewer Import:
  1. Create → Map from bibliographic database files
  2. Select Scopus CSV file
  3. Choose analysis type and unit
  4. Apply thresholds and filters

3. Network Analysis Options

Co-authorship Networks

  • Authors: Network of collaborating researchers
  • Organizations: Institutional collaboration networks
  • Countries: International research collaboration
  • Metrics: Total link strength, cluster membership
Best Practices:
  • Minimum 2-5 documents per author
  • Focus on top 100-500 authors
  • Use normalization for fair comparison
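The effect of a minimum-documents threshold on a co-authorship network can be sketched as follows. The author lists are invented, `coauthorship_links` is an illustrative helper, and full counting (each shared paper adds 1 to a link) is assumed; fractional counting would weight links differently.

```python
from collections import Counter
from itertools import combinations


def coauthorship_links(author_lists, min_docs=2):
    """Keep authors with at least min_docs papers and count shared papers."""
    doc_counts = Counter(a for authors in author_lists for a in set(authors))
    kept = {a for a, n in doc_counts.items() if n >= min_docs}
    links = Counter()
    for authors in author_lists:
        for pair in combinations(sorted(set(authors) & kept), 2):
            links[pair] += 1
    return kept, links


# Hypothetical author lists, one per paper.
papers = [
    ["SMITH J", "LEE M", "KUMAR R"],
    ["SMITH J", "LEE M"],
    ["LEE M", "KUMAR R"],
    ["SMITH J", "PATEL A"],
]

kept, links = coauthorship_links(papers, min_docs=2)
print(sorted(kept))                   # ['KUMAR R', 'LEE M', 'SMITH J']
print(links[("LEE M", "SMITH J")])    # 2
print(links[("KUMAR R", "SMITH J")])  # 1
```

PATEL A has a single paper and is dropped by the threshold, which is how the minimum-documents setting trims peripheral authors from the map.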

Co-occurrence Networks

  • Author Keywords: Co-occurrence of author-provided keywords
  • Index Keywords: Database-assigned subject terms
  • Title Terms: Terms extracted from publication titles
  • Abstract Terms: Terms from abstracts and full text
Term Processing:
  • Stemming and lemmatization options
  • Stop word removal
  • Manual term cleaning and synonyms
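Term cleaning before a co-occurrence count can be sketched like this. The stop-word set and synonym map are tiny hypothetical examples, standing in for whatever stop list or thesaurus file an analyst would prepare for a real dataset.

```python
from collections import Counter
from itertools import combinations

# Illustrative, deliberately small lists (assumptions, not a standard resource).
STOP_WORDS = {"the", "of", "and", "in", "for", "a", "an"}
SYNONYMS = {"ai": "artificial intelligence", "ml": "machine learning"}


def clean_keywords(keywords):
    """Lowercase, collapse whitespace, merge synonyms, drop stop words."""
    cleaned = set()
    for keyword in keywords:
        term = " ".join(keyword.lower().split())
        term = SYNONYMS.get(term, term)
        if term and term not in STOP_WORDS:
            cleaned.add(term)
    return cleaned


def cooccurrence(docs):
    """Count documents in which each pair of cleaned keywords appears together."""
    counts = Counter()
    for keywords in docs:
        for pair in combinations(sorted(clean_keywords(keywords)), 2):
            counts[pair] += 1
    return counts


# Hypothetical keyword lists, one per document.
docs = [
    ["AI", "Education"],
    ["Artificial  Intelligence", "education", "ML"],
    ["Machine Learning", "the"],
]

counts = cooccurrence(docs)
print(counts[("artificial intelligence", "education")])  # 2
```

Without the synonym map, "AI" and "Artificial Intelligence" would appear as two separate nodes and the link to "education" would be split between them.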

Citation Networks

  • Document Citations: Which papers cite which papers
  • Source Citations: Journal citation patterns
  • Author Citations: Citation relationships between authors
  • Temporal Analysis: Citation evolution over time
Citation Metrics:
  • Citation count and normalized scores
  • Average publication year
  • Average citations per publication

Bibliographic Coupling

  • Document Coupling: Papers sharing common references
  • Author Coupling: Authors with overlapping reference lists
  • Source Coupling: Journals with similar reference patterns
  • Strength Measure: Number of shared references
Applications:
  • Research front identification
  • Contemporary similarity mapping
  • Subject area delineation

4. Visualization and Interpretation

Network Visualization Options

Network View

Purpose: Shows the structure of the network with nodes and connections

  • Node Size: Based on weight (citations, occurrences)
  • Link Thickness: Strength of relationship
  • Colors: Cluster membership or scores
  • Layout: Force-directed algorithm for optimal positioning

Overlay View

Purpose: Overlays additional information on the network structure

  • Average Publication Year: Temporal evolution visualization
  • Average Citations: Impact visualization
  • Custom Scores: User-defined metrics overlay
  • Color Gradient: Continuous scale representation

Density View

Purpose: Shows the density of items in different areas of the map

  • Hot Spots: Areas with high concentration of items
  • Smooth Visualization: Continuous density surface
  • Core Areas: Identification of research concentrations
  • Periphery Detection: Emerging or specialized topics

5. Advanced Features and Customization

Clustering and Community Detection
VOSviewer Clustering Parameters:
  • Resolution: Controls cluster size (default 1.0)
      • Higher values → smaller clusters
      • Lower values → larger clusters
  • Minimum cluster size: Minimum items per cluster
  • Clustering algorithm: Modularity-based optimization
Manual Cluster Refinement:
  • Move items between clusters
  • Merge or split clusters
  • Custom cluster colors and labels
Network Statistics and Metrics
Available Network Metrics:
  • Number of items and links
  • Total link strength
  • Average clustering coefficient
  • Network density
  • Average path length
  • Centrality measures (per item)
Export Options:
  • Network data (GML, GraphML, Pajek)
  • Map images (PNG, EPS, SVG)
  • Item and cluster statistics
  • Coordinate information
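Two of the metrics listed above, network density and the average clustering coefficient, can be illustrated on a small undirected network. This is a generic textbook-style sketch with an invented adjacency structure, not a reproduction of how any particular tool computes its statistics.

```python
def density(n_nodes, n_links):
    """Share of possible undirected links that are present."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_links / (n_nodes * (n_nodes - 1))


def average_clustering(adjacency):
    """Mean local clustering coefficient; nodes with < 2 neighbours count as 0."""
    coefficients = []
    for neighbours in adjacency.values():
        k = len(neighbours)
        if k < 2:
            coefficients.append(0.0)
            continue
        # Count links that exist between pairs of this node's neighbours.
        links = sum(
            1 for u in neighbours for v in neighbours
            if u < v and v in adjacency[u]
        )
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients)


# Hypothetical network: a triangle A-B-C with D attached to C.
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

n_links = sum(len(nbrs) for nbrs in adjacency.values()) // 2
print(n_links)                                   # 4
print(round(density(len(adjacency), n_links), 3))  # 0.667
print(round(average_clustering(adjacency), 3))     # 0.583
```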

Real-World Applications

Explore practical applications of scientometrics and informetrics across different domains and research scenarios.

Research Evaluation

Assessment of individual, institutional, and national research performance

Science Policy

Evidence-based science and technology policy development

Science Mapping

Visualization and analysis of knowledge domains and research fronts

Trend Prediction

Forecasting emerging research areas and technological developments

Tools and Resources

Comprehensive collection of tools, databases, and resources for scientometric analysis.

Analysis Software

Software tools for bibliometric and scientometric analysis

Databases

Citation databases and data sources

Learning Resources

Books, courses, and educational materials

Journals & Communities

Academic journals and professional communities

Scientometrics Knowledge Assessment

What does the h-index measure?
  • Total number of publications by an author
  • Average citations per publication
  • Balance between productivity and citation impact
  • Percentage of highly cited papers