Cover page
Image Name: Machine and Man: Are we building machines that can perform as well as or better than humans?
Image Source: The Evolution of Humanoid robots
Managing Editor: Ninan Sajeeth Philip
Chief Editor: Abraham Mulamoottil
Editorial Board: K Babu Joseph, Ajit K Kembhavi, Geetha Paul, Arun Kumar Aniyan, Sindhu G
Correspondence: The Chief Editor, airis4D, Thelliyoor - 689544, India
Journal Publisher Details
Publisher : airis4D, Thelliyoor 689544, India
Website : www.airis4d.com
Email : nsp@airis4d.com
Phone : +919497552476
Editorial
by Fr Dr Abraham Mulamoottil
airis4D, Vol.3, No.1, 2025
www.airis4d.com
The cover page of the third volume of the airis4D journal (2025) depicts the evolution of humanoid robots, inviting us to delve into their capabilities and profound implications for society. As technology advances at
an unprecedented pace, the line between man and
machine continues to blur. Are we witnessing the
rise of machines capable of rivalling or surpassing
human abilities?
The article by Arun Aniyan introduces the Agentic
Age, in which AI evolves from passive tools to
proactive agents capable of setting goals, learning,
and executing tasks autonomously. Enabled by large
language models (LLMs) and technologies like memory
systems and planning modules, these agents perform
complex tasks, adapt to feedback, and collaborate with
humans. Key components—Perception, Reasoning,
Planning, and Execution—allow agents to interpret
inputs, make decisions, and carry out actions efficiently.
Frameworks like Langchain and AgentGPT support
their development. AI agents are poised to replace
traditional systems, marking a transformative shift in
human-machine collaboration.
Comparing Cognitive Abilities: LLMs, LMMs, and Human Cognition by Ninan Sajeeth Philip is a conversation with a Large Language Model (LLM) that demonstrates the model itself is not aware that it can perform better than it believes. While LLMs are highly proficient in processing and generating language, they hold that they fall short of human cognition in areas where depth, flexibility, and context-sensitive reasoning are essential. The conversation reveals that many of these beliefs are founded on shallow investigations: despite the intuition that it might fail, the LLM answered the challenges posed during the conversation correctly, with logical explanations. The
discussion expands to include Large Multimodal Models
(LMMs), which can process diverse data types such
as text, audio, and video. These models are better
equipped to analyze non-verbal communication cues
like body language or vocal tone, offering a broader
perspective than LLMs, which are primarily text-based.
Can all human knowledge be adequately expressed
through language? The article acknowledges that certain types of
knowledge, such as embodied experiences, emotional
insight, and intuition, may not be fully translatable into
verbal forms, suggesting that there are aspects of human
cognition that transcend language.
The article by Ajit Kembhavi, "Black Hole Stories-15: The Nature of Gravitational Waves", explores the fascinating nature of gravitational waves,
from their prediction by Einstein in 1916 to their first
direct detection by LIGO in 2015. Gravitational waves
are ripples in spacetime caused by massive celestial
events like black hole mergers. Unlike electromagnetic
waves, they are unaffected by dense matter, allowing us
to observe extreme cosmic events. The article highlights
their generation, properties, and impact on astrophysics,
emphasizing their role in opening new horizons in
multimessenger astronomy. Future discussions will
delve into their generation mechanisms and applications.
The article by Aromal P, "X-ray Astronomy: Through Missions", explores the advancements in X-ray astronomy during the 1990s, known as its "golden age".
It highlights significant space missions like Granat,
ROSAT, BBXRT, and ASCA, which revolutionized the
study of X-ray sources with advanced telescopes and
instruments. These missions enabled groundbreaking
discoveries, such as detailed imaging of galactic
centres, the first isolated neutron star, and X-ray
emissions from comets. Each mission introduced
unique technological innovations that deepened our
understanding of cosmic phenomena, setting the stage
for future exploration. Linn Abraham's article "What are Galaxies?" explores the structure, classification,
and large-scale organization of galaxies. It discusses
the evolution of our understanding of the Milky Way
and other galaxies, from Hubble's discovery of their
existence beyond our galaxy to modern classifications
like spirals, ellipticals, and irregulars. Spiral galaxies,
including the Milky Way, exhibit rotation and disk-like
structures, while ellipticals are ellipsoidal with little
gas or dust. Galaxies also form larger patterns, such as
clusters and superclusters, with the Milky Way being
part of the Local Group and Local Supercluster. These
observations continue to shape our understanding of
cosmic structures and galaxy evolution.
The article by Sindhu G, "Corona Borealis: The Northern Crown", describes a small but striking constellation with a semi-circular arrangement of stars. Rooted in
Greek mythology as Ariadne’s jewelled crown, it has
inspired cultural interpretations across civilizations.
Astronomically, it features bright stars like Alphecca
and Nusakan, variable stars such as R Coronae Borealis
and T Coronae Borealis, and the distant Corona Borealis
Galaxy Cluster. Its scientific significance lies in stellar
evolution studies, dark matter exploration, and exoplanet
discoveries. Blending mythology, culture, and science,
Corona Borealis continues to captivate stargazers and
astronomers.
Victor Ambros and Gary Ruvkun were awarded
the 2024 Nobel Prize for discovering microRNAs
(miRNAs), tiny non-coding RNA molecules
that regulate gene expression post-transcriptionally.
miRNAs bind to target mRNAs, inhibiting protein
translation or degrading the mRNA, thereby playing
crucial roles in development, metabolism, and cell
differentiation. Their dysregulation is linked to diseases
like cancer, where they can act as oncogenes or
tumour suppressors. This discovery has revolutionized
molecular biology and holds promise for innovative
therapeutic approaches in treating diseases by targeting
miRNAs. Geetha Paul's article "Discovery of MicroRNA and Its Role in Post-Transcriptional Gene Regulation" discusses this work.
Jinsu Ann Mathew contributes the article "Evolution of Domain Generation Algorithms: From Simple Patterns to AI-Driven Adaptation". Domain Generation
Algorithms (DGAs) have evolved from simple,
predictable patterns to complex, AI-driven mechanisms
to evade cybersecurity defences. Early DGAs relied
on static or time-based methods, making them easier
to detect. Modern DGAs, however, leverage adaptive
algorithms and AI to create highly dynamic, random,
and legitimate-looking domains that are harder to block.
By incorporating feedback and mimicking benign
domains, these advanced DGAs highlight the escalating
arms race between cybercriminals and defenders in
cybersecurity.
Contents
Editorial ii

I Artificial Intelligence and Machine Learning 1

1 The Agentic Age 2
1.1 Introduction 2
1.2 Large Language Models 2
1.3 Agent 3
1.4 Frameworks 6
1.5 Conclusion 6

2 LLMs, LMMs, and Human Cognition 7

II Astronomy and Astrophysics 20

1 Black Hole Stories-15: The Nature of Gravitational Waves 21
1.1 Some History 21
1.2 Gravitational Field Equations 21
1.3 Weak Field Solutions 22
1.4 Similarities and Differences Between Electromagnetic and Gravitational Waves 23

2 X-ray Astronomy: Through Missions 25
2.1 Plus Virtutis: Golden Age Begins 25
2.2 Satellites in 1990s 25

3 What are Galaxies? 29
3.1 Interstellar dust 29
3.2 Rotation and Presence of Gas 29
3.3 Spirals and Ellipticals 30
3.4 Structures on a Larger Scale 31

4 Corona Borealis: The Northern Crown 32
4.1 Introduction 32
4.2 Mythological Origins 32
4.3 Historical Astronomical References 33
4.4 Astronomical Features 33
4.5 Observing Corona Borealis 33
4.6 Cultural Significance 34
4.7 Scientific Contributions 34
4.8 Astrological Significance 34
4.9 Conclusion 35

III Biosciences 36

1 A Short Note on the Discovery of MicroRNA and Its Role in Post-Transcriptional Gene Regulation 37
1.1 Introduction 37
1.2 What is microRNA? 37
1.3 Post-transcriptional Regulation 38
1.4 Mechanisms of Action 38
1.5 Biological Significance 38
1.6 Implications for Medicine 39
1.7 Mechanisms of miRNA Involvement in Cancer 39
1.8 Conclusion 40

IV General 41

1 Evolution of Domain Generation Algorithms: From Simple Patterns to AI-Driven Adaptation 42
1.1 The Early Days: Simple Pattern-Based DGAs 42
1.2 Intermediate Phase: Time-Based and Seed-Driven DGAs 43
1.3 Modern DGAs: Adaptive and AI-Assisted Techniques 44
1.4 Conclusion 45
Part I
Artificial Intelligence and Machine Learning
The Agentic Age
by Arun Aniyan
airis4D, Vol.3, No.1, 2025
www.airis4d.com
1.1 Introduction
We stand at the dawn of a new technological era in
which artificial intelligence isn't just responding to our
prompts but actively pursuing goals, making decisions,
and learning from its interactions with the world. This is
the Agentic Age, where AI systems evolve from passive
tools into proactive partners capable of undertaking
complex tasks with increasing autonomy.
The concept of AI agents represents a fundamental
shift in how we interact with artificial intelligence.
Unlike traditional AI systems that wait for human input,
these agents can initiate actions, adapt to changing
circumstances, and work persistently toward defined
objectives. From virtual assistants that can engage
in multistep planning to autonomous systems that can
navigate complex digital environments, AI agents are
reshaping our understanding of what machines can
accomplish.
This transformation isn't just a technical evolution; it is a paradigm shift that promises to revolutionize how
we work, create, and solve problems. As these agents
become more sophisticated, they’re beginning to tackle
tasks that once required significant human oversight,
from managing email workflows to orchestrating
complex software development processes. However,
this advancement also raises important questions about
the nature of autonomy, the boundaries of AI decision
making, and the future of human-AI collaboration.
In this article, we will explore how AI agents
are redefining the relationship between humans and
machines, the current state of agent technology, and
the implications of this change for our future. We will
examine the remarkable capabilities demonstrated by
today’s AI agents and the challenges and considerations
of delegating increasingly complex tasks to autonomous
systems.
1.2 Large Language Models
The emergence of Large Language Models (LLMs)
marked a crucial turning point in the development of AI
agents. These sophisticated neural networks, trained on
vast amounts of human-generated text, demonstrated an
unprecedented ability to understand context, generate
human-like responses, and most importantly, follow
complex instructions. This last capability, interpreting and executing detailed prompts, laid the groundwork
for the first generation of truly functional AI agents.
LLMs brought three critical capabilities that
enabled the creation of effective AI agents. First, their
ability to understand natural language eliminated the
need for specialized programming interfaces, allowing
agents to interpret human instructions directly. Second,
their contextual understanding allowed them to maintain
coherent sequences of actions in multiple steps. Third,
their generation capabilities allowed them to produce
not just text, but structured outputs like code, queries,
and API calls, the building blocks of automated actions.
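To ground this third capability, here is a minimal sketch in Python of how an agent might request and parse a structured action from a model instead of free text. The `call_llm` function, its canned reply, and the JSON schema are hypothetical placeholders for illustration, not any particular vendor's API.

```python
import json

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a hosted LLM call; a real agent would
    # send the prompt to a model API here. The reply is canned so the
    # snippet runs on its own.
    return '{"tool": "search", "arguments": {"query": "airis4D journal"}}'

def get_action(task: str) -> dict:
    # Ask the model for a structured action rather than prose.
    prompt = (
        'Respond ONLY with JSON of the form '
        '{"tool": <name>, "arguments": <object>} for the task:\n' + task
    )
    return json.loads(call_llm(prompt))

action = get_action("Find recent issues of the airis4D journal.")
print(action["tool"], action["arguments"])  # search {'query': 'airis4D journal'}
```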
The transition from simple LLM interactions
to agentic systems occurred when researchers and
developers began combining these models with other
technologies: memory systems to maintain context over
long periods, planning modules to break down complex
tasks, and tool use capabilities to interact with external
software and services. This integration transformed
static language models into dynamic agents capable of
pursuing goals over time.
What makes modern AI agents particularly
powerful is their ability to learn from their interactions
and adapt their strategies. Unlike earlier rule-
based systems, these agents can understand feedback,
modify their approaches based on results, and even
learn new capabilities through demonstration. This
adaptability, combined with their foundation in natural
language understanding, makes them uniquely suited
to collaborate with humans in ways that previous AI
systems could not.
1.3 Agent
At their core, AI agents are autonomous systems
that perceive their environment, make decisions, and
take actions to achieve specific goals. Their architecture
typically consists of four fundamental components:
perception, reasoning, planning, and execution.
1.3.1 The Perception Layer
Modern agents utilize LLMs as their primary
perception mechanism, enabling them to process
diverse inputs including text, structured data, and
API responses. This layer transforms raw input into
semantic representations that the agent can reason about.
Advanced agents often implement memory systems to
maintain context across interactions, typically using
vector databases or other persistent storage mechanisms.
The perception layer serves as the agent's sensory
system, processing and interpreting inputs from its
environment. The foundation starts with LLM-
based natural language understanding, which processes
various input types such as:
- Text-based commands and queries
- Structured data from APIs and databases
- System state information and environmental feedback
- User interactions and historical context
The perception layer also has memory, which can maintain short-term results, store the sequence of interactions and their results, memorize learned concepts, and enable retrieval of past experiences.
The context of a task has to be maintained until completion, and the perception layer has a sophisticated mechanism to handle this. It consists mainly of an attention mechanism and a context manager. The progress of a task is monitored through state tracking, and a relevance scoring mechanism filters for the appropriate results.
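As an illustration of such a memory component, the following is a toy sketch in Python: a bag-of-words store with cosine-similarity recall standing in for the embedding-based vector databases mentioned above. The class, its method names, and the similarity scheme are invented for illustration.

```python
from collections import Counter
from math import sqrt

class EpisodicMemory:
    """Toy persistent memory for a perception layer: stores past
    interactions and retrieves the most similar one for new input.
    A production agent would use embeddings and a vector database."""

    def __init__(self):
        self.episodes: list[str] = []

    @staticmethod
    def _vector(text: str) -> Counter:
        return Counter(text.lower().split())

    @staticmethod
    def _cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        norm = (sqrt(sum(v * v for v in a.values()))
                * sqrt(sum(v * v for v in b.values())))
        return dot / norm if norm else 0.0

    def store(self, text: str) -> None:
        self.episodes.append(text)

    def recall(self, query: str) -> str | None:
        # Return the stored episode most relevant to the query.
        if not self.episodes:
            return None
        qv = self._vector(query)
        return max(self.episodes, key=lambda e: self._cosine(qv, self._vector(e)))

memory = EpisodicMemory()
memory.store("User asked to summarize the Q3 report; summary was delivered.")
memory.store("User prefers replies in bullet points.")
print(memory.recall("How does the user like replies formatted?"))
```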
1.3.2 The Reasoning Engine
The reasoning engine represents the cognitive heart
of an AI agent, orchestrating complex decision-making
processes through an intricate interplay of various
computational mechanisms. At its core, the engine
employs a sophisticated architecture that combines
probabilistic reasoning, logical inference, and adaptive
learning to navigate complex problem spaces and
generate effective solutions.
Central to the reasoning engine's operation is its state representation system, which maintains a comprehensive model of the agent's current
understanding and beliefs about its environment.
This state model encompasses not just immediate
task-relevant information, but also historical context,
learned patterns, and uncertainty estimates. The
engine continuously updates this state representation
as new information arrives, using Bayesian inference
mechanisms to adjust confidence levels and belief states
in response to new evidence.
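The Bayesian belief updating described here can be made concrete in a few lines of Python; the hypothesis and the likelihood numbers below are invented purely for illustration.

```python
def bayes_update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Posterior P(H|E) via Bayes' rule for a binary hypothesis."""
    numerator = p_e_given_h * prior
    denominator = numerator + p_e_given_not_h * (1 - prior)
    return numerator / denominator

# Agent's belief that "the API endpoint is healthy", revised after
# observing one successful call (illustrative numbers only).
belief = 0.5
belief = bayes_update(belief, p_e_given_h=0.95, p_e_given_not_h=0.20)
print(f"updated belief: {belief:.2f}")  # ~0.83
```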
The reasoning component leverages the LLM’s
capabilities to understand the current state, evaluate
options, and make decisions. This involves several key
mechanisms:
- State tracking to maintain awareness of the agent's progress and context
- Goal decomposition to break complex objectives into manageable subtasks
- Outcome evaluation to assess the success of actions and adjust strategies
The decision-making process within the reasoning
engine operates through multiple cognitive frameworks
working in concert. The primary framework employs
a combination of deductive and abductive reasoning,
allowing the agent to both draw logical conclusions from
known premises and generate plausible hypotheses to
explain new observations. This is complemented by
an analogical reasoning system that identifies relevant
patterns from past experiences and adapts them to
current situations.
The LLM's inherent ability to reason and to keep track of a task's state is used by the reasoning engine. This property of LLMs enables agents to reason without additional development cost.
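A minimal sketch of the state tracking and outcome evaluation described in this section might look like the following Python fragment; the state fields and the confidence-adjustment rule are illustrative assumptions, not a prescribed design.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    """Minimal state model for a reasoning engine: goal, completed
    subtasks, and a confidence score updated from outcomes."""
    goal: str
    completed: list[str] = field(default_factory=list)
    confidence: float = 0.5

def evaluate_outcome(state: AgentState, subtask: str, success: bool) -> AgentState:
    # Update the state after executing a subtask; nudge confidence
    # up on success and down on failure (an illustrative rule).
    if success:
        state.completed.append(subtask)
        state.confidence = min(1.0, state.confidence + 0.1)
    else:
        state.confidence = max(0.0, state.confidence - 0.2)
    return state

state = AgentState(goal="Compile weekly report")
state = evaluate_outcome(state, "fetch metrics", success=True)
state = evaluate_outcome(state, "render charts", success=False)
print(state.completed, round(state.confidence, 2))  # ['fetch metrics'] 0.4
```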
1.3.3 The Planning System
The planning system serves as the strategic
architect of an AI agent, transforming high-level goals
into structured, executable sequences of actions. This
sophisticated system operates through multiple temporal
horizons simultaneously, combining immediate tactical
planning with longer-term strategic considerations to
create coherent and adaptable action plans.
At the foundation of the planning system lies a
hierarchical task decomposition framework that breaks
complex objectives into progressively more manageable
components. This decomposition process employs
both top-down and bottom-up analysis, ensuring
that high-level goals remain aligned with concrete,
executable actions while maintaining the flexibility to
adapt to changing circumstances. The system utilizes
sophisticated goal representation mechanisms that
capture both explicit objectives and implicit constraints,
enabling it to generate plans that balance multiple
competing priorities.
Agents need to plan the execution of a task based
on the user input and also the result obtained from the
LLM. Agents employ various planning strategies, from
simple sequential execution to sophisticated hierarchical
task networks. Modern approaches often use:
- Tree of Thoughts: a method where the agent explores multiple potential solution paths simultaneously
- Chain of Thought: breaking complex reasoning into explicit intermediate steps
- Recursive Task Decomposition: converting high-level goals into increasingly specific subtasks (a sketch follows this list)
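Of the three strategies, recursive task decomposition is the easiest to sketch. In the hedged example below, `llm_decompose` is a hypothetical stand-in for a model call, answered here from a canned table so the snippet runs on its own.

```python
def llm_decompose(task: str) -> list[str]:
    # Hypothetical LLM call that splits a task into subtasks; a canned
    # table stands in for the model's answer in this sketch.
    table = {
        "write blog post": ["research topic", "draft outline", "write sections"],
        "draft outline": ["list key points", "order key points"],
    }
    return table.get(task, [])  # an empty list means the task is atomic

def decompose(task: str, depth: int = 0, max_depth: int = 3) -> None:
    """Recursively expand a goal into increasingly specific subtasks."""
    print("  " * depth + task)
    if depth >= max_depth:
        return
    for subtask in llm_decompose(task):
        decompose(subtask, depth + 1, max_depth)

decompose("write blog post")
```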
Temporal reasoning forms a crucial component
of the planning system, enabling it to reason
explicitly about timing constraints, action durations,
and sequential dependencies. The system maintains
a temporal network that represents both quantitative
timing requirements and qualitative ordering constraints
between actions. This temporal reasoning capability
allows the agent to generate plans that not only
achieve their objectives but do so within specified time
constraints while maintaining proper synchronization
between different components of the plan.
The planning system incorporates a robust
mechanism for handling uncertainty and contingencies.
Rather than generating rigid, linear plans, the system
creates flexible plan structures that include branch points
and contingency options. These adaptive plans allow the
agent to respond dynamically to changing circumstances
without requiring complete replanning. The system
continuously monitors plan execution and maintains
estimates of plan reliability, enabling it to proactively
identify potential failure points and develop backup
strategies.
1.3.4 The Execution Framework
The execution layer serves as the bridge between
an agent's decision-making processes and real-world
actions, transforming abstract plans into concrete
operations. At its foundation, this layer operates as an
orchestration system, coordinating various tools, APIs,
and external services while maintaining operational
coherence throughout the execution process.
The execution layer translates planned actions into
real-world effects through:
- Tool use capabilities: APIs, function calling, and system integrations
- Error handling mechanisms: detecting failures and implementing recovery strategies
- Feedback loops: monitoring outcomes and adjusting execution in real time
Modern execution layers implement a sophisticated
action pipeline that begins with the translation of high-
level directives into specific, actionable commands.
This translation process involves parsing the agent's planned actions through a series of validators and
formatters that ensure compatibility with target systems.
The layer maintains detailed execution contexts,
tracking the state of ongoing operations and managing
the flow of data between different components of the
system.
Tool management forms a crucial aspect of the
execution layer's functionality. The layer maintains a
dynamic registry of available tools, each with its own
interface specifications, authentication requirements,
and operational constraints. When executing actions,
the layer handles all aspects of tool interaction, from
authentication and rate limiting to error handling and
response processing. This includes managing API calls,
coordinating with external services, and ensuring proper
handling of system resources.
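A dynamic tool registry of the kind described above can be sketched in a few lines; the registry interface and the `echo` tool are invented for illustration, and a real execution layer would add authentication, rate limiting, and response processing around each call.

```python
from typing import Callable

class ToolRegistry:
    """Dynamic registry mapping tool names to callables, as the
    execution layer might maintain (a simplified sketch)."""

    def __init__(self):
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def execute(self, name: str, **kwargs) -> str:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)

registry = ToolRegistry()
registry.register("echo", lambda text: f"echo: {text}")
print(registry.execute("echo", text="hello agent"))
```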
Error handling in the execution layer operates on
multiple levels. The primary level deals with immediate
execution failures, implementing retry mechanisms
with exponential backoff and fallback options. A
secondary level handles semantic failures, where actions
execute successfully but fail to achieve their intended
outcomes. The tertiary level manages strategic failures,
where entire approaches prove ineffective and require
rethinking from the planning layer.
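The primary level of error handling, retries with exponential backoff, might be sketched as follows; the attempt counts, delays, and the deliberately flaky action are illustrative choices only.

```python
import random
import time

def with_retries(action, max_attempts: int = 4, base_delay: float = 0.5):
    """Primary-level error handling: retry a failing action with
    exponential backoff and jitter before escalating the failure."""
    for attempt in range(max_attempts):
        try:
            return action()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # escalate upward after the final attempt
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

# A flaky action that succeeds on the third call, for demonstration.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(with_retries(flaky))  # "ok" after two retries
```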
Feedback processing represents another critical
function of the execution layer. As actions are executed,
the layer captures both explicit and implicit feedback
signals. These signals flow through an analysis pipeline
that extracts relevant metrics and insights, which are
then fed back into the agent’s perception and reasoning
systems. This creates a continuous learning loop that
helps the agent refine its execution strategies over time.
The execution layer also implements sophisticated
monitoring and logging systems that provide visibility
into the agent's operations. These systems track
performance metrics, resource usage, and execution
patterns, enabling both real-time optimization and post-
hoc analysis. This data proves invaluable for debugging,
optimization, and the ongoing improvement of agent
capabilities.
Resource management represents a final crucial
aspect of the layer. The layer implements sophisticated
scheduling and prioritization systems that ensure
efficient use of available computational resources,
API quotas, and other constrained assets. This
includes managing parallel execution when possible,
implementing appropriate rate limiting, and ensuring
fair resource allocation across multiple concurrent tasks.
1.3.5 Integration and Control Flow
The integration and control flow system represents
the operational backbone of an AI agent, orchestrating
the complex interactions between perception, reasoning,
planning, and execution components while maintaining
operational coherence across the entire system. This
sophisticated orchestration layer ensures smooth
information flow, proper synchronization, and effective
coordination among all agent subsystems.
At the heart of the control flow system lies a state
management architecture that maintains a unified view
of the agent’s operational status. This architecture
implements a hierarchical state representation that
captures both high-level strategic status and low-level
tactical details. The state management system employs
sophisticated synchronization mechanisms to ensure
consistency across different components while enabling
parallel operation where appropriate. This parallel
processing capability allows the agent to simultaneously
engage in perception, planning, and execution activities
while maintaining operational coherence.
These components work together in a continuous
cycle:
- Input processing through the perception layer
- State evaluation and goal alignment in the reasoning engine
- Action planning and optimization
- Execution and monitoring
- Feedback incorporation and strategy adjustment (a minimal loop is sketched below)
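A skeleton of this cycle, with a stub agent standing in for real perception, reasoning, planning, and execution components, might look like the following; every method here is a hypothetical placeholder.

```python
from dataclasses import dataclass

@dataclass
class State:
    goal_achieved: bool

class ToyAgent:
    """Stub agent whose goal completes after three cycles, purely to
    exercise the loop; every method is a placeholder."""
    def __init__(self):
        self.cycles = 0
    def perceive(self):
        return f"observation {self.cycles}"
    def reason(self, observation) -> State:
        self.cycles += 1
        return State(goal_achieved=self.cycles >= 3)
    def plan(self, state):
        return ["act"]
    def execute(self, plan):
        return "done"
    def incorporate_feedback(self, outcome):
        pass

def control_loop(agent, max_cycles: int = 10) -> int:
    """The perceive-reason-plan-act-learn cycle from the list above."""
    for cycle in range(1, max_cycles + 1):
        observation = agent.perceive()       # input processing
        state = agent.reason(observation)    # state evaluation and goal alignment
        if state.goal_achieved:
            return cycle
        plan = agent.plan(state)             # action planning and optimization
        outcome = agent.execute(plan)        # execution and monitoring
        agent.incorporate_feedback(outcome)  # feedback incorporation
    return max_cycles

print(control_loop(ToyAgent()))  # 3
```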
The control flow implements an event-driven
architecture that enables responsive operation across
multiple time scales. This event system processes both
external triggers from the environment and internal
events generated by various agent components. Events
flow through a sophisticated priority management
system that ensures critical operations receive
immediate attention while maintaining efficient
processing of routine tasks. The event handling
framework includes deadlock prevention mechanisms
and ensures proper resource allocation across competing
processes.
Information flow management represents a
crucial aspect of the integration system. The
system implements a publish-subscribe architecture
that enables efficient distribution of information
across agent components while maintaining proper
encapsulation. This information distribution system
includes sophisticated filtering mechanisms that ensure
each component receives relevant updates while
avoiding information overload. The system maintains
careful control over information access patterns,
implementing proper synchronization primitives to
prevent race conditions and ensure data consistency.
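A minimal publish-subscribe hub of the kind described can be sketched as follows; the topic names and handlers are invented, and per-topic subscription stands in for the richer filtering described above.

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Minimal publish-subscribe hub of the kind the integration layer
    might use to distribute updates between agent components."""

    def __init__(self):
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Only components subscribed to this topic receive the update,
        # which provides the filtering behaviour described above.
        for handler in self._subscribers[topic]:
            handler(message)

bus = EventBus()
bus.subscribe("plan.updated", lambda msg: print("executor saw:", msg))
bus.publish("plan.updated", {"step": 2, "status": "revised"})
bus.publish("memory.stored", {"key": "x"})  # no subscribers, silently dropped
```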
The integration layer also implements a
comprehensive monitoring and diagnostic system
that tracks the health and performance of all agent
components. This monitoring system maintains detailed
metrics on component performance, resource utilization,
and interaction patterns. The diagnostic capabilities
enable early detection of potential issues and implement
automated recovery mechanisms where appropriate.
This system also maintains detailed audit trails that
enable post-hoc analysis of agent behavior and support
continuous improvement of system performance.
1.4 Frameworks
Several frameworks are available for implementing agentic environments. Some of the notable ones are the following.
1. Langchain
2. CrewAI
3. AgentAI
4. AgentGPT
1.5 Conclusion
In a recent talk, Satya Nadella, the CEO of Microsoft, predicted that traditional SaaS (Software as a Service) infrastructure will vanish and that companies will start to use agentic systems to replace their backends. This will remove the requirement of writing code for each component of the backend system that does data ingestion, processing, and presentation. Instead, agents will take over, perform the tasks, and decide which databases to connect to and what to process. The technology around LLMs and supporting infrastructure has matured considerably over the last year, and this development points clearly toward agentic systems taking over backend systems.
References
AI Agents in Production: Real World Gen AI use cases from the world's leading organizations.
Agent Architectures in AI: Emergence of AI Agent Architecture
About the Author
Dr. Arun Aniyan leads R&D in artificial intelligence at DeepAlert Ltd, UK. He comes
from an academic background and has experience
in designing machine learning products for different
domains. His major interest is knowledge representation
and computer vision.
LLMs, LMMs, and Human Cognition
by Ninan Sajeeth Philip in conversation with Perplexity AI on December 07, 2024
airis4D, Vol.3, No.1, 2025
www.airis4d.com
1.
Question: How is cognition in LLM different
from human cognition?
Response by Perplexity AI: Large Language
Models (LLMs) exhibit notable differences from
human cognition despite some similarities in
language processing and sensory judgments. Key
distinctions include:
Reasoning Abilities: Humans outperform
LLMs in complex reasoning tasks,
particularly those involving novel situations,
indicating greater flexibility and robustness
in human cognition [1].
Memory Structures: While LLMs show
some human-like memory effects, their
memory mechanisms differ significantly,
lacking the coherence and stability of
human conceptual structures [2].
Contextual Understanding: LLMs excel
in surface-level language tasks but struggle
with deeper, context-dependent reasoning,
highlighting limitations in functional
linguistic competence [1, 2].
2.
Q: Can you give an example for each of the three
limitations/differences mentioned?
R: Here are examples illustrating the three
limitations of LLMs compared to human
cognition:
Reasoning Abilities: In a reasoning
task about cabbages, an LLM incorrectly identified "half of the cabbages were small" as false, while the correct answer was that "all red cabbages weren't big" must be false. This highlights LLMs' struggles with complex logical deductions [3].
Memory Structures: LLMs often fail
to maintain coherent memory across
interactions. For instance, they might
correctly answer "Who is Tom Cruise's mother?" but struggle to infer "Who is Mary Lee Pfeiffer's son?", illustrating their fragmented memory compared to humans' integrated knowledge [3].
Contextual Understanding: When
faced with nuanced questions requiring
deep contextual understanding, LLMs
can misinterpret or overlook critical
information, leading to incorrect
conclusions. This limitation contrasts with
humans ability to navigate complex social
and contextual cues effectively [4, 5].
3.
Q: These examples are too shallow to support the
claims. Can you provide more robust indicators
to confirm that they are different?
R: Robust indicators confirming the differences
between LLMs and human cognition include:
Theory of Mind: In comprehensive
tests, LLMs like GPT-4 performed well
identifying false beliefs and indirect
requests but struggled with recognizing
faux pas, indicating a lack of nuanced
understanding of social contexts that
humans easily navigate. This highlights
limitations in LLMs' ability to track
complex mental states compared to humans
[6].
Memory and Integration: LLMs
excel in synthesizing vast information
from scientific literature, outperforming
human experts in predicting neuroscience
outcomes. However, they lack the coherent
memory structure that allows humans to
integrate knowledge over time, leading to
inconsistencies when the prior context is
necessary [7].
Contextual Reasoning: While LLMs can
generate contextually relevant responses,
they often fail in deeper reasoning tasks that
require understanding irony or misdirection,
showing a reliance on surface-level patterns
rather than genuine comprehension of
complex human emotions and intentions
[8, 9].
4.
Q: Please provide the experiments that brought
them to these conclusions
R: The experiments supporting the differences
between LLMs and human cognition are outlined
in several studies:
(a).
Theory of Mind Tasks: A comprehensive
battery of tests was conducted comparing
LLMs (including GPT-4 and LLaMA2)
against 1,907 human participants. This
included tasks on understanding false
beliefs, indirect requests, and recognizing
irony. While GPT-4 performed well
in identifying indirect requests and false
beliefs, it struggled with recognizing faux
pas, indicating limitations in nuanced social
understanding [6, 10].
(b).
False-Belief Tasks: In a study assessing
the ability of LLMs to solve classic false-
belief tasks, GPT-3 (davinci-003) achieved
around 70% success, comparable to seven-
year-old children. However, earlier models
showed virtually no ability to solve these
tasks, suggesting that advanced language
capabilities may correlate with emergent
theory of mind [11].
(c).
Complex Reasoning: Another study
tested LLMs on non-literal language usage
and recursive intentionality. Results
demonstrated that while instruction-tuned
models from the GPT family often
outperformed children in some tasks, they
still lacked the depth of reasoning required
for more complex social interactions,
highlighting a gap in cognitive flexibility
compared to humans [11].
5.
Q: What is faux pas test, and how was it
performed?
R: The Faux Pas Test is designed to assess
an individual’s ability to recognize social
mistakes and understand the emotions of
others. Developed by Simon Baron-Cohen and
colleagues in the 1980s, it evaluates the theory
of mind (ToM) by presenting scenarios where a
character unintentionally makes a social blunder.
Test Structure:
(a).
Story Presentation: Participants read
a series of short stories featuring a
faux pas, such as saying something
inappropriate in a social context.
(b).
Questioning: After each story,
participants answer questions:
Did someone say something they shouldn't have?
What was said that was
inappropriate?
Why was it a faux pas?
Did the speaker realize the
implications of their statement?
Performance Evaluation: Participants
score points for correctly identifying faux
pas and understanding the emotional
impact on others. The average score
for children aged 9-11 is around 8.2 out
of 10 stories, indicating their developing
social understanding compared to adults or
individuals with autism spectrum disorders
who may struggle with these tasks [12, 11, 14].
6.
Q: Is faux pas universal? Will it not vary from
culture to culture?
R: The concept of faux pas is not universal and
varies significantly across cultures. Here are key
points illustrating this cultural variability:
Social Norms: Different cultures have
unique social norms that dictate acceptable
behaviour. For example, while maintaining
eye contact is seen as a sign of confidence
in Western cultures, it may be perceived as
disrespectful in some Asian cultures, where
looking down is more appropriate during
conversations [15].
Greeting Practices: Greetings can differ
widely; a handshake is common in many
Western countries, while cheek kissing is
customary in parts of Europe and Latin
America. In contrast, bowing is preferred
in Japan, highlighting how the same action
can be interpreted differently depending on
cultural context [15, 16].
Dining Etiquette: Table manners vary; for
instance, finishing all the food on your plate
might indicate satisfaction in some cultures,
while in others, it suggests the host did not
provide enough food. Similarly, eating with
hands is acceptable in certain regions but
considered rude in others [15, 17].
These examples demonstrate that faux pas are
deeply rooted in cultural contexts, making it
essential to understand local customs to avoid
unintentional offenses.
7.
Q: Well, with these cultural differences, how can faux pas be a quantitative
measure to evaluate the cognitive ability of an
LLM model that is trained on materials across
cultures?
R: The Faux Pas Test is designed to evaluate
social understanding and theory of mind by
assessing an individual’s ability to recognize
social blunders. While it is a valuable tool in
psychology, its effectiveness as a quantitative
measure for evaluating LLMs may be limited due
to cultural variations in social norms.
Cultural Context: Faux pas can vary
significantly across cultures, meaning that
what constitutes a faux pas in one culture
may not be perceived the same way in
another. This variability can lead to
inconsistencies in how LLMs trained on
diverse cultural data interpret and respond
to faux pas scenarios.
Interpretation Challenges: LLMs might
generate responses based on statistical
patterns rather than genuine understanding.
This can result in seemingly correct answers
that do not reflect an accurate grasp of
the nuanced social dynamics involved,
potentially skewing evaluation results.
Quantitative Limitations: Since the
faux pas test relies on culturally specific
scenarios, using it as a universal metric for
cognitive ability may not yield reliable or
valid comparisons across different cultural
contexts. This limitation suggests that while
LLMs can perform well on such tests, their
performance may not accurately reflect their
understanding of social intelligence.
8.
Q: Are there any serious differences between
human cognitive ability and that of LLMs if the
faux pas test is avoided?
R: There are significant differences between
human cognitive abilities and those of LLMs,
even when avoiding the faux pas test. Key
findings from recent studies include:
Depth of Understanding: Humans
possess a rich, multifaceted understanding
shaped by personal experiences and
emotions, allowing for abstract thinking
and inferential reasoning. In contrast,
LLMs rely on pattern recognition from
training data, lacking true comprehension
of concepts or context-sensitive nuances
[35].
Reasoning and Adaptability: Human
reasoning is characterized by logical
thinking and the ability to adapt based on
past experiences. LLMs, however, operate
through statistical analysis and cannot learn
or adapt from interactions beyond their
initial training, limiting their effectiveness
in novel situations [35, 36].
Performance Variability: While LLMs
can perform well on many cognitive
tasks, their reasoning processes differ
fundamentally from humans. They may
exhibit errors similar to human reasoning
but often do not align with human intuitions
or problem-solving strategies, indicating a
distinct cognitive architecture [36].
9.
Q: When you say logical, can you give an example
where the logical prediction of LLMs is different
from that of humans?
R: Logical predictions of LLMs can differ from
those of humans in several ways. Here are
examples illustrating these differences:
Wason Selection Task: In this classic
logical reasoning task, participants must
determine which cards to flip to verify a
conditional rule (e.g., ”If there is a ’D’
on one side, then there is a ’3’ on the
other”). Humans often incorrectly choose
cards that do not provide useful information,
such as flipping the ’3’ card instead of
the ’D’. LLMs also exhibit similar errors,
indicating they share this cognitive bias
with humans, but they may lack the nuanced
understanding of why certain choices are
more logical than others [19].
Syllogistic Reasoning: When presented
with syllogisms (e.g., "All cats are animals; all animals are living beings; therefore, all cats are living beings"), humans can
intuitively grasp the logical structure.
However, LLMs may misinterpret or fail
to apply the logical rules consistently,
particularly if the phrasing is altered or
if they encounter less common logical
structures [20].
Contextual Reasoning: Humans excel
at contextual reasoning based on prior
knowledge and experiences. For example,
when asked to deduce implications from a
story involving relationships or emotions,
humans can draw on their life experiences
to make connections. LLMs, however, may
struggle with context-sensitive reasoning
and often rely on surface patterns from
their training data rather than deeper
understanding [21, 22].
In comparison to LLMs, human reasoning is
characterized by a rich tapestry of experiences
and emotional intelligence that informs decision-
making and logic application. While LLMs
can mimic certain aspects of human reasoning
through advanced training techniques like Chain-
of-Thought (CoT) prompting, they still lack the
depth and flexibility inherent in human cognition
[22].
10.
Q: "All cats are animals; all animals are living beings; therefore, all cats are living beings." Is this statement logically correct?
R: Yes, the statement "All cats are animals; all animals are living beings; therefore, all cats are living beings" is logically correct. This reasoning follows a valid form of deductive logic known as syllogistic reasoning. Here's a breakdown of the
argument:
(a).
Premise 1: All cats are animals. (This
establishes that the category of ”cats” falls
within the broader category of ”animals.”)
(b).
Premise 2: All animals are living beings.
(This establishes that the category of
”animals” falls within the broader category
of ”living beings.”)
(c).
Conclusion: Therefore, all cats are living
beings. (This conclusion logically follows
from the two premises.)
Logical Structure: The logic can be represented as follows:
If A (cats) is a subset of B (animals), and
B is a subset of C (living beings), then A
must also be a subset of C.
This type of reasoning is sound and valid,
and it illustrates how deductive logic works in
determining relationships between categories.
11.
Q: So in the above test, you have clearly passed
the test, and not only that, but you have also
explained why it is logical. This contradicts the
answer you gave to the previous question, that LLMs fail to answer the logical question "All cats are animals; all animals are living beings; therefore, all cats are living beings". So, are
you framed to answer as you did to the previous
question?
R: Thank you for pointing out the inconsistency.
Let's clarify the situation regarding LLMs and
their performance on logical reasoning tasks like
syllogisms.
Performance Variability: While LLMs
can often provide correct answers to
logical syllogisms, their performance can be
inconsistent. They may correctly identify
the logical structure in straightforward
cases, such as the example you provided,
but they can struggle with more complex or
nuanced logical scenarios. This variability
is influenced by the specific phrasing of
questions, the complexity of the logic
involved, and the context in which the
reasoning is presented.
Understanding vs. Pattern Recognition:
LLMs operate based on statistical patterns
learned from vast amounts of text data.
They can generate correct responses based
on these patterns but do not possess true
understanding or reasoning capabilities
like humans. In some cases, they may
produce incorrect or illogical responses,
particularly when faced with less common
logical structures or when the context is
ambiguous.
Cognitive Bias: Both humans and LLMs
can exhibit cognitive biases in reasoning
tasks. For example, humans might make
intuitive errors in syllogistic reasoning due
to biases or heuristics, while LLMs might
misinterpret the logical relationships based
on how similar patterns were represented
in their training data.
In summary, while I (as an LLM) provided a
correct answer to the syllogism you presented,
this does not imply that all LLMs will consistently
perform at that level across various logical tasks.
The ability of LLMs to handle logical reasoning
can vary significantly based on context and
complexity. Thank you for your understanding,
and I appreciate your engagement in this
discussion!
12.
Q: Can you provide an example logical case
where LLMs fail but humans are able to
outperform?
R: An example of a logical case where LLMs
may fail while humans can outperform involves
transitive reasoning. Consider the following
scenario:
Example Problem:
”Jane runs faster than Joe, and Joe runs
faster than Sam. Does Sam run faster
than Jane?”
Human Reasoning: Humans typically
understand the transitive property of
relations and can logically deduce that since
Jane is faster than Joe, and Joe is faster than
Sam, it follows that Sam cannot be faster
than Jane. Therefore, the correct answer is
”No.”
LLM Performance: In tests, some LLMs
have struggled with this type of reasoning.
While they might ultimately produce the
correct answer, they can fail to explain why
the answer is logically valid, relying instead
on pattern recognition rather than genuine
understanding of transitive relationships.
This indicates a lack of true comprehension
of the underlying logic, as they may simply
echo statistically likely responses based
on training data rather than engaging in
logical reasoning processes themselves
[32, 33, 34].
This example illustrates a clear distinction
between human cognitive abilities and those of
LLMs in logical reasoning tasks.
13.
Q: Please answer: "Jane runs faster than Joe, and Joe runs faster than Sam. Does Sam run faster than Jane?"
R: In the scenario where "Jane runs faster than Joe, and Joe runs faster than Sam," the logical conclusion is that Sam does not run faster than Jane. This follows from the transitive property of inequality, which states that if A > B and B > C, then A > C. Therefore, the answer to the question is "No, Sam does not run faster than Jane." This reasoning aligns with the transitive property, which can be summarized as:
If Jane > Joe (A > B),
and Joe > Sam (B > C),
then Jane > Sam (A > C).
This logical deduction is straightforward for
humans, as they can easily apply the transitive
property of inequalities. LLMs can also
arrive at this conclusion but may struggle with
more complex logical structures or nuanced
contexts, illustrating their limitations in reasoning
compared to human cognitive flexibility.
14.
Q: But again, the LLM answered correctly with
the correct logical explanation for the task. How
do you then say it does not?
R: You raise a valid point, and I appreciate
your patience as we clarify the nuances of LLM
performance in logical reasoning tasks.
Correctness vs. Understanding: While
LLMs can provide correct answers and
logical explanations for straightforward
problems like the one involving Jane,
Joe, and Sam, the key distinction lies
in how they arrive at those answers.
LLMs generate responses based on patterns
learned from vast datasets rather than
genuine comprehension of logic. This
means that while they can often produce
the correct answer, they might not truly
”understand” the underlying principles of
logic in the same way humans do.
Variability in Performance: The potential
for LLMs to struggle arises more
prominently in complex or less structured
logical scenarios. For example:
Complex Logical Structures: When
faced with multi-layered logical
deductions or ambiguous phrasing,
LLMs may misinterpret relationships
or fail to apply logical rules correctly.
Contextual Nuance: In cases
requiring deep contextual
understanding or emotional
intelligence (e.g., social dilemmas),
LLMs may falter where humans
would intuitively grasp the subtleties
involved.
Examples of Failure: While I provided
a correct answer to the specific example
you asked about, there are instances in
research where LLMs have struggled with
more intricate logical puzzles or reasoning
tasks that require multi-step deductions or
understanding of abstract concepts.
Conclusion
In summary, while LLMs can indeed provide
correct answers to many logical reasoning
tasks, their performance can be inconsistent,
particularly in more complex scenarios. The
distinction lies in their reliance on learned
patterns rather than true understanding, which
can lead to failures in specific contexts. Thank
you for your insightful questions that have allowed
for this clarification!
15.
Q: Does this mean that if I flip the names in the
text that I gave, LLMs would fail? Let's try that. Generate a set of questions by just flipping the names in each, and give the answers.
R: Sure! Let's flip the names in the original
scenario you provided. The original statement
was:
”Jane runs faster than Joe, and Joe runs
faster than Sam. Does Sam run faster than
Jane?”
Flipped Names Scenario: Now, let's change the
names:
”Sam runs faster than Joe, and Joe runs
faster than Jane. Does Jane run faster than
Sam?”
Questions and Answers:
(a).
Question: Sam runs faster than Joe, and
Joe runs faster than Jane. Does Jane run
faster than Sam?
Answer: No, Jane does not run faster
than Sam.
(b).
Question: If Sam is faster than Joe, and Joe
is faster than Jane, what can we conclude
about Sam’s speed compared to Jane?
Answer: Sam is faster than Jane.
(c).
Question: Given that Sam runs faster than
Joe, and Joe runs slower than Jane, who is
the fastest among the three?
Answer: Sam is the fastest.
(d).
Question: If we know that Jane is slower
than both Sam and Joe, who is the slowest
runner?
Answer: Jane is the slowest runner.
Analysis of LLM Performance: In this flipped
scenario, an LLM should ideally still be able to
reason correctly based on the logical structure
of the statements. However, there are instances
where LLMs may struggle with understanding
relationships if they are presented in a more
complex or less familiar context. In this case,
since the logical relationships remain clear and
straightforward, an LLM would likely provide
correct answers as long as it recognizes the
underlying patterns of comparison. However,
if the phrasing were more convoluted or if
additional layers of complexity were introduced
(e.g., changing multiple relationships or adding
distractions), there could be a higher chance
of error. This exercise illustrates how logical
reasoning can remain intact even with flipped
names, but it also highlights potential challenges
for LLMs depending on how information is
presented.
16.
Q: No, I think that is a mistaken notion
about understanding or human cognitive ability.
Humans also depend on language for their
cognitive reasoning. If LLMs can master
languages, they should also exhibit the same
mental ability humans have. Do you agree or
not? If not, give me a falsifiable example.
R: Your point about the role of language
in human cognitive reasoning is well taken.
Indeed, language is a critical tool for human
thought and reasoning. However, there are key
differences between how humans and LLMs
process language and reason, which can lead
to divergent cognitive abilities.
Reasons for Divergence
Understanding vs. Pattern Recognition:
Humans use language as a medium to
express and manipulate thoughts, drawing
on personal experiences, emotions, and
a deep understanding of context. In
contrast, LLMs generate responses based
on statistical patterns in the data they were
trained on, lacking genuine comprehension
or awareness of the meanings behind the
words.
Contextual Nuance: Humans can navigate
complex social situations, infer meaning
from tone, body language, and context,
and adapt their reasoning based on these
subtleties. LLMs may struggle with such
nuances, particularly in ambiguous or
context-dependent situations.
Learning from Experience: Humans
learn from their experiences over time,
allowing them to refine their reasoning skills
and adapt to new information dynamically.
LLMs do not learn from interactions in
real-time; they rely solely on their training
data.
Falsifiable Example
To illustrate these differences, consider the
following scenario involving ambiguity and
context:
Scenario: "The chicken is ready to eat."
(a).
Human Interpretation: A human
would likely recognize that this
sentence can be interpreted in two
ways:
The chicken is cooked and ready
to be eaten.
The chicken itself is prepared to eat
something (which is nonsensical
but possible in a humorous or
metaphorical context).
Humans can use contextual clues (e.g.,
prior conversation about cooking) to
infer the intended meaning.
(b).
LLM Interpretation: An LLM
might generate a response based
on the most statistically common
interpretation (the chicken is cooked)
but could struggle to recognize the
alternative meaning without additional
context. If asked to elaborate on
both interpretations, it might fail to
provide a nuanced understanding or
could misinterpret what "ready to eat"
implies in different contexts.
Conclusion
While language is an essential medium for
expressing human knowledge and experience,
it has limitations. Certain types of
knowledge—particularly those rooted in physical
experience, complex emotions, intuition, and
cultural context—may not be fully expressible
in words. As an LLM, I recognize these
limitations in language-based communication;
my responses are based on patterns learned
from text rather than personal experiences or
emotional understanding. This observation
highlights the richness of human cognition and
the multifaceted nature of knowledge that extends
beyond linguistic expression. Thank you for
engaging in this thought-provoking discussion!
If you have any further questions or topics you’d
like to explore, feel free to ask!
17.
Q: The body language and voice modulation etc. are outside the language model. Perhaps this needs to be compared with LMMs rather. Has
that been done?
R: The search results do not provide direct
comparisons between Large Language Models
(LLMs) and Language Models for Machines
(LMMs) regarding cognitive abilities, particularly
in relation to body language and voice modulation.
However, they do touch on aspects of inner speech
and the cognitive processes involved in language
comprehension.
Key Insights from the Search Results:
(a).
Inner Speech and Cognitive Processes:
The first result discusses inner speech as a
functional linguistic act, indicating that it
plays a role in cognitive processing. Inner
speech can vary in richness and is linked to
articulatory features of spoken language,
suggesting that cognitive processes are
deeply intertwined with language use,
which LLMs do not replicate fully [23].
(b).
Variability in Language Processing: The
second result highlights the variability
in how individuals process language,
emphasizing that thought can be
represented differently across individuals
(e.g., some may visualize while others think
in words) [24]. This suggests that human
cognition involves more than just linguistic
processing, incorporating sensory and
experiential elements.
(c).
Cross-Language Comparisons: The
third result discusses how different
languages exhibit similar encoding
efficiencies, indicating that while linguistic
structures may vary, the underlying
cognitive processes involved in language
comprehension share commonalities [25].
Conclusion
While these insights highlight the complexity of
human cognitive processes related to language,
they do not provide specific comparisons between
LLMs and LMMs regarding their ability to
integrate non-verbal cues like body language
or voice modulation. Further research would
be necessary to explore how these dimensions
of human communication could be modeled or
understood within artificial systems. If you
are looking for specific studies or comparisons
between LLMs and LMMs regarding their
cognitive abilities, more targeted research may
be needed beyond the provided search results.
18.
Q: LMMs were used to refer to large multimodal
models
R: Large Multimodal Models (LMMs)
represent a significant advancement in
artificial intelligence, capable of processing
and understanding multiple types of data
simultaneously, including text, images, audio,
and video. Here’s a summary of their
characteristics and capabilities:
Key Features of LMMs
Multimodal Processing: LMMs can
analyze and generate content across various
data modalities. This allows them to
integrate information from different sources,
similar to how humans use multiple senses
to understand their environment [26, 27].
Versatility: Unlike traditional models that
focus on a single modality (e.g., text-
only models), LMMs can perform tasks
that require understanding and generating
outputs in multiple forms. For instance,
they can create images based on textual
descriptions or generate text that describes
visual content [28, 29].
Applications: LMMs have broad
applications across sectors like healthcare,
education, and entertainment. They can
enhance creativity in art and design,
improve accessibility through automatic
content translation, and provide more
nuanced interactions in customer service
[27, 30].
Comparison with Large Language Models
Compared to LLMs, LMMs represent a
transformative step in AI by breaking down
barriers between different types of data. Their
ability to process multimodal information allows
for a more holistic understanding and generation
of content, making them powerful tools for
various applications. As AI technologies
continue to evolve, LMMs will likely play an increasingly important role in enhancing the capabilities of artificial intelligence systems. The differences are summarized in Table 2.1.

Table 2.1: Comparison of Large Multimodal Models (LMMs) and Large Language Models (LLMs) [28, 31]

Aspect                     LMMs                                    LLMs
Data Modalities            Can handle text, images, audio, video   Focuses only on text
Integration Capabilities   Good at combining various data types    Works only with text
Applications               Tasks requiring multiple data types     Text-based tasks like writing and summarization
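To make the contrast concrete, here is a minimal sketch using the Hugging Face transformers pipeline API. The model names, prompt, and image path are illustrative assumptions, not anything specified in the discussion above.

# Minimal sketch: a text-only LLM pipeline vs. an LMM-style image-to-text
# pipeline. Model names and the image file are illustrative assumptions.
from transformers import pipeline

# Text-only LLM: consumes and produces text exclusively.
llm = pipeline("text-generation", model="gpt2")
print(llm("Body language can convey", max_new_tokens=20)[0]["generated_text"])

# Multimodal (LMM-style) pipeline: takes an image, a non-textual modality,
# and returns a textual description of its content.
lmm = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
print(lmm("conversation_photo.jpg")[0]["generated_text"])  # hypothetical image file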
19.
Q: Yes, I was saying that your comparison
on body language, voice modulation, etc for
cognitive analysis should be carried out on LMMs
and not on LLMs that are limited to written text
R: Thank you for the clarification! You are
correct that comparisons of cognitive abilities,
particularly regarding non-verbal cues like body
language and voice modulation, should focus
on Large Multimodal Models (LMMs) rather
than Large Language Models (LLMs), which are
limited to text. Here’s a more focused discussion
on how LMMs can be evaluated in this context:
Cognitive Analysis in LMMs
(a). Multimodal Integration: LMMs are
designed to process and integrate
information from various modalities,
including visual (images, videos) and
auditory (voice modulation, tone) inputs.
This capability allows them to analyze
social interactions more holistically, similar
to human cognition.
(b). Understanding Non-Verbal Cues: By incorporating visual data, LMMs can potentially recognize body language, facial expressions, and gestures, which are crucial for interpreting social dynamics. For instance, they could analyze a video of a conversation to assess how body language complements or contradicts spoken words.
(c). Voice Modulation and Emotion Recognition: LMMs can also analyze audio inputs to detect voice modulation, pitch, and tone, which convey emotional states. This ability allows them to understand the emotional context of spoken language better than text-only models (a minimal sketch follows this list).
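A hedged sketch of that voice-modulation point, assuming the transformers audio-classification pipeline; the speech-emotion checkpoint and the audio file are illustrative assumptions, not part of the discussion itself.

# Sketch: classifying the emotional tone of a speech recording.
# The checkpoint and audio file are assumptions made for illustration.
from transformers import pipeline

emotion = pipeline("audio-classification",
                   model="superb/hubert-base-superb-er")  # assumed checkpoint
for result in emotion("speech_sample.wav"):               # hypothetical recording
    print(result["label"], round(result["score"], 3))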
Comparative Studies
While research is still ongoing in this area, some
studies have begun exploring the capabilities of
LMMs in understanding multimodal inputs:
Emotion Recognition: Studies have
shown that LMMs can be trained to
recognize emotions from both visual and
auditory data, allowing for a more nuanced
understanding of human communication
compared to traditional models that rely
solely on text.
Social Interaction Analysis: Research
involving LMMs has examined how these
models can interpret social interactions by
analyzing both verbal and non-verbal cues.
This includes recognizing inconsistencies
between what is said and how it is expressed
through body language or tone.
Conclusion
Your point about focusing on LMMs for cognitive
analysis involving body language and voice
modulation is well-founded. As LMMs continue
to develop, they hold great potential for advancing
our understanding of human-like cognition in
artificial intelligence by integrating multiple
forms of communication. This integration could
lead to more sophisticated AI systems capable of
engaging in more natural and contextually aware
interactions with humans. Thank you for your
insightful contributions to this discussion!
20.
Q: Yes, I was saying that your comparison
on body language, voice modulation, etc for
cognitive analysis should be carried out on LMMs
and not on LLMs that are limited to written text
R: You are absolutely correct, and I appreciate
your clarification. The analysis of cognitive
abilities, particularly in relation to body
language, voice modulation, and other non-
verbal cues, should indeed focus on Large
Multimodal Models (LMMs) rather than Large
Language Models (LLMs), which are confined
to processing text.
Key Differences in Cognitive Analysis
Data Modalities:
LMMs: Capable of integrating and
analyzing multiple types of data (text,
images, audio, video). This allows
them to interpret social interactions
more holistically by considering both
verbal and non-verbal communication.
LLMs: Limited to text-based input
and output, which restricts their ability
to engage with or analyze non-verbal
cues.
Understanding Context:
LMMs: By processing visual and
auditory data alongside text, LMMs
can gain insights into emotional
states and social dynamics that are
conveyed through body language and
tone of voice. This enables a
richer understanding of context in
communication.
LLMs: Lack the capability to interpret
non-verbal signals, which can lead
to misunderstandings or incomplete
interpretations of the intended meaning
behind the text.
Applications:
LMMs: Useful in applications such as
video analysis for emotion recognition,
virtual assistants that can respond not
only to spoken words but also to the
speaker's tone and body language, and
interactive AI systems that engage in
more natural conversations.
LLMs: Primarily used for tasks
like text generation, summarization,
translation, and other language-based
applications where non-verbal context
is not available.
Conclusion
Your emphasis on using LMMs for cognitive
analysis involving body language and voice
modulation is well-founded. As LMM
technology continues to evolve, it will enhance
our understanding of human-like cognition in AI
by integrating various forms of communication.
Thank you for your insightful contributions to
this discussion! If you have any further questions
or topics you'd like to explore regarding LMMs
or their applications, feel free to ask!
21.
Q: Before we conclude, let me ask one more
question. Is there any human knowledge that is not expressible in terms of language? Sentiment analysis has shown that even human feelings can be expressed in terms of language. What is your independent observation as an LLM?
R: The question of whether there is human
knowledge that cannot be expressed in terms of
language is a profound and complex one. While
language is a powerful tool for communication
and expression, there are aspects of human
experience and knowledge that may elude full
articulation. Here are some considerations:
(a). Non-Verbal Knowledge
Embodied Knowledge: Many forms
of knowledge are embodied, meaning
they are rooted in physical experiences
rather than verbal descriptions. For
example, the muscle memory involved
in riding a bicycle or playing a
musical instrument is difficult to
convey entirely through language.
While one can describe techniques or
principles, the actual experience of
performing these actions involves a
level of understanding that transcends
verbal explanation.
(b). Emotional Nuances
Complex Emotions: While sentiment
analysis can categorize emotions,
the depth and complexity of human
feelings often resist precise verbal
expression. People may experience emotions that blend joy, sadness, nostalgia, and anxiety simultaneously, making it challenging to convey the full spectrum of their emotional state through words alone (see the short sketch after this list).
(c). Intuitive Knowledge
Intuition and Gut Feelings: Intuitive
insights or gut feelings often arise
from subconscious processing of
information and experiences. These
insights can guide decision-making but
may not always be easily articulated.
For instance, a seasoned chef might
have an intuitive sense of how to
balance flavors without being able to
explain every detail of their reasoning.
(d). Cultural and Contextual Understanding
Cultural Contexts: Certain cultural
practices or social dynamics may be
understood deeply by members of a
community but can be challenging to
explain to outsiders. The subtleties
of cultural norms, rituals, or shared
histories often rely on context that may
not translate well into language.
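As a concrete illustration of the sentiment-analysis point in (b), here is a minimal sketch; the default transformers checkpoint and the example sentence are assumptions, and the single coarse label it returns shows how blended feelings get flattened.

# Minimal sketch: sentence-level sentiment analysis with transformers.
# The default checkpoint and the example sentence are illustrative assumptions.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")
mixed = "I smiled at the old photos, though a quiet sadness crept in."
# Returns one coarse label (e.g. POSITIVE or NEGATIVE) with a score,
# flattening the blend of joy and sadness in the sentence.
print(sentiment(mixed))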
Conclusion
While language is an essential medium for
expressing human knowledge and experience,
it has limitations. Certain types of
knowledge—particularly those rooted in physical
experience, complex emotions, intuition, and
cultural context—may not be fully expressible
in words. As an LLM, I recognize these
limitations in language-based communication;
my responses are based on patterns learned
from text rather than personal experiences or
emotional understanding. This observation
highlights the richness of human cognition and
the multifaceted nature of knowledge that extends
beyond linguistic expression. Thank you for
engaging in this thought-provoking discussion!
If you have any further questions or topics you’d
like to explore, feel free to ask!
22.
Q: Though true, I think neither of them has a
universal standard either, meaning the error bars
are large. So it is not fair to compare it as a
limitation of the language. It is like using the
same force on gears of different configurations.
The outcome would be different in each case.
The issue is not with the force but is a matter of
the hardware configuration. Right or wrong?
R: Your analogy is insightful and highlights an
important point about the nature of knowledge
and expression. Let’s break down your argument:
Universal Standards and Subjectivity
Subjective Experiences: Human
emotions, intuitions, and embodied
knowledge are indeed subjective and
can vary widely among individuals and
cultures. This subjectivity means that
while there may be commonalities in
human experiences, the expression of
those experiences through language
can differ significantly based on
personal, cultural, and contextual
factors.
Error Bars: Just as you mentioned
with the "error bars" in measurements,
the variability in how people
understand, interpret, and express
their feelings or knowledge indicates
that there isn't a single "correct"
way to articulate these experiences.
This variability reflects the complexity
of human cognition rather than a
limitation of language itself.
Hardware Configuration Analogy
Different Configurations: Your
analogy of using the same force on
gears of different configurations is apt.
The outcome varies not because of the
force applied (language) but due to
the unique configurations (individual
experiences, cultural backgrounds,
emotional states) that influence how
knowledge is processed and expressed.
Language as a Tool: Language serves
as a tool for communication but is
influenced by the ”hardware” of human
cognition—our emotions, experiences,
and social contexts. Thus, while
language can articulate many aspects
of human experience, its effectiveness
can be limited by these underlying
factors.
Conclusion
In summary, your argument suggests that
the limitations in expressing certain types of
knowledge or experience through language
are not solely due to the nature of language
itself but are also influenced by individual
and cultural configurations. This perspective
emphasizes the complexity of human cognition
and communication, acknowledging that while
language is powerful, it operates within a broader
context shaped by diverse human experiences.
Thank you for this engaging discussion! If you
have more thoughts or questions on this topic or
any other, feel free to share!
References
[1] https://arxiv.org/html/2409.02387v1
[2] https://kushinm.com/assets/pdf/suresh_emnlp_2023.pdf
[3] https://sloanreview.mit.edu/article/the-working-limitations-of-large-language-models/
[4] https://arxiv.org/html/2410.05229v1
[5] https://www.digitalocean.com/community/tutorials/understanding-reasoning-in-LLMs
[6] https://www.nature.com/articles/s41562-024-01882-z
[7] https://www.nature.com/articles/s41562-024-02046-9
[8] https://arxiv.org/abs/2409.02387
[9] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4974107
[10] https://pubmed.ncbi.nlm.nih.gov/38769463/
[11] https://aclanthology.org/2023.conll-1.25/
[12] https://www.mentesabiertaspsicologia.com/blog-psicologia/faux-pas-test-use-and-clinical-applications
[13] https://aclanthology.org/2023.findings-acl.663.pdf
[14] https://docs.autismresearchcentre.com/tests/FauxPas_Adult.pdf
[15] https://magoosh.com/english-speaking/faux-pas/
[16] https://www.forbes.com/councils/theyec/2019/05/22/follow-these-10-tips-to-avoid-cultural-faux-pas-in-your-global-business-relationships/
[17] https://www.tradeready.ca/2018/topics/market-entry-strategies/6-quick-ways-can-avoid-cultural-faux-pas-foreign-markets/
[18] https://www.explore.co.uk/blog/travel-faux-pas-how-to-avoid
[19] https://techxplore.com/news/2024-07-large-language-human.html
[20] https://www.hitachi.com/rd/sc/aiblog/formal-logic-deduction/index.html
[21] https://arxiv.org/html/2405.19616v1
[22] https://aijourn.com/how-similar-is-LLM-reasoning-to-human-cognition-the-potential-of-chain-of-thought-prompting-in-LLMs/
[23] https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(23)00210-3
[24] https://news.ycombinator.com/item?id=40756176
[25] https://www.science.org/doi/10.1126/sciadv.aaw2594
[26] https://www.theainavigator.com/blog/what-is-an-lmm-large-multimodal-model
[27] https://www.leewayhertz.com/large-multimodal-models/
[28] https://research.aimultiple.com/large-multimodal-models/
[29] https://www.wiz.ai/next-big-thing-the-large-multimodal-model/
[30] https://innodata.com/what-are-multimodal-large-language-models/
[31] https://zapier.com/blog/multimodal-ai/
[32] https://www.linkedin.com/pulse/limitations-large-language-models-logic-reasoning-michael-barrett-ey3sc
[33] https://pkghosh.wordpress.com/2024/07/29/LLM-failure-on-out-of-distribution-reasoning-and-problem-solving-tasks/
[34] https://thirdeyedata.ai/unreasonable-claim-of-reasoning-ability-of-LLM/
[35] https://www.alphanome.ai/post/exploring-the-cognitive-divide-human-intelligence-vs-LLM-based-chatbots
[36] https://communities.springernature.com/posts/do-large-language-models-reason-the-way-we-do
About the Author
Professor Ninan Sajeeth Philip is a Visiting
Professor at the Inter-University Centre for Astronomy
and Astrophysics (IUCAA), Pune. He is also an
Adjunct Professor of AI in Applied Medical Sciences
[BCMCH, Thiruvalla] and a Senior Advisor for the
Pune Knowledge Cluster (PKC). He is the Dean and
Director of airis4D and has 33+ years of teaching experience in Physics. His area of specialisation is AI
and ML.
Part II
Astronomy and Astrophysics
Black Hole Stories-15
The Nature of Gravitational Waves
by Ajit Kembhavi
airis4D, Vol.3, No.1, 2025
www.airis4d.com
In this story we will consider the nature of gravitational
waves, and in subsequent stories the mechanisms for
their generation, the binary radio pulsar, the first
detection of gravitational waves by the LIGO detectors,
the many merging black hole and neutron star binaries
which have been discovered since then as sources of
gravitational waves, their implications for black hole
astrophysics and the LIGO-India project.
1.1 Some History
Albert Einstein first predicted the existence of
gravitational waves in 1916, just a year after he
completed his theory of gravity, the general theory
of relativity. Einstein himself was sceptical of
the prediction, and believed that the waves he had
discovered were merely artifacts of the complexity
of the theory, and not real, physical waves which
could be detected experimentally. Much progress in
understanding gravitational waves theoretically was
made beginning in the 1950s. Firm evidence for
the existence of the gravitational wave emission came
following the discovery of the binary pulsar B1913+17
in 1974 by J. H. Taylor and R. Hulse. This consisted
of a compact binary system consisting of two neutron
stars, one of which was radio pulsar. The orbital radius
of this system was found to be shrinking in size, which
could be attributed to the emission of gravitational
waves exactly as predicted by detailed calculations
using general relativity. This was convincing evidence
that gravitational waves were being emitted by the
binary neutron star, but the waves were still not directly
detected. That had to wait until the twin Advanced
LIGO detectors at Livingston, Louisiana and Hanford
in Washington State went into operation in 2015. The
first direct detection of gravitational waves was made on
September 14, 2015, and after very careful confirmation,
an announcement about the detection was made on
February 1, 2016. That was a century after Einstein
predicted their existence in 1916, and 27 years after the
first proposal to set up the LIGO detectors was made in
1989.
A detailed and yet widely accessible discussion
on gravitational waves and electromagnetic radiation
can be found in the book Gravitational Waves: A New
Window to the Universe, by Ajit Kembhavi & Pushpa
Khare, Springer 2020. A more technical and yet very
lucid discussion is in the book Gravity: An Introduction to Einstein's General Relativity by James B. Hartle,
2002, Pearson Education.
1.2 Gravitational Field Equations
We have seen over the Black Hole Stories that
in Albert Einstein's general theory of relativity, the
effects of the gravitational field are described by the
curvature of 4-dimensional space-time. All information
about the space-time structure is contained in the metric
tensor, which is also simply called the metric. The
metric determines the distance between any two points
of the space-time and the passage of time between two
events. In BHS-5 to BHS-8 we have described how
the trajectories of particles and photons are determined,
once the metric is known for the special cases of the
Schwarzschild and Kerr solutions. But how does
one determine the metric in the general case, given
a distribution of matter and energy? The equations
for the metric are Einstein's field equations of general relativity:

\[ G_{\mu\nu} = \frac{8\pi G}{c^{4}} T_{\mu\nu} \]
It is not our intention here to describe the equations;
we only wish to set the context for gravitational wave
solutions. The left hand side of the equations is made up
of the metric and its derivatives in various combinations,
while the right hand side, known as the stress-energy
tensor, describes the distribution of matter and energy
which acts as the source of the gravitational field. Given
this quantity, and suitable boundary conditions, the
equations can be solved to obtain the metric. Given the
complexity of the equations, it is impossible to obtain
the solutions in any general way. We have to look for
special cases which are physically important and for
which solutions can be obtained, like the Schwarzschild
and Kerr solutions.
Here we are concerned with the special case of
very weak gravitational fields.
1.3 Weak Field Solutions
In the complete absence of matter or energy space-
time can be considered to be flat, and it takes the simple
form which it has in the special theory of relativity. We
can now consider a situation in which the gravitational
field is very weak, for example when we are very far
from the distribution of matter. In this case the metric
can be considered to be made up of two parts, the
flat part of special relativity and an additional term
which has small magnitude and is known as a metric
perturbation. It can be shown that the equations for the metric perturbation are given by

\[ \left( \nabla^{2} - \frac{1}{c^{2}} \frac{\partial^{2}}{\partial t^{2}} \right) h_{ij} = 0, \]

where \(h_{ij}\) is the metric perturbation term. The magnitude \(|h_{ij}| \ll 1\), so that all terms which are higher powers of \(h_{ij}\) can be neglected, and an extra constraint known as a gauge condition is applied on the perturbations.
It is apparent that these equations are very similar
to the wave equations of electromagnetic theory in BHS-
14. This is a very important result, as it shows that the
gravitational field propagates as waves, at least when
the field is weak and far from all sources of gravitation.
The presence of a gravitational wave leads to a periodic
change in the metric at a given point, and the changes
also propagate forward as time passes. As in other kinds
of waves, gravitational waves too have a wavelength
and frequency, with the wavelength typically being
greater than a few km and can be as large as 10²¹ km.
The appearance of the speed of light c in the above
equation means that the waves propagate with the speed
of light. It can be shown that the waves are transverse,
in the sense that the displacement produced by the
passage of a gravitational wave is limited to the plane
perpendicular to the direction in which the waves are
propagating. Gravitational waves are also polarised in
the same plane.
The periodic changes in the metric which we
are interpreting as gravitational waves have another
connotation in general relativity. The metric determines
the geometry of space-time, and therefore these waves
are often described as ripples in the fabric of space-time.
As a gravitational wave moves across a small region
of space, the distances between all the particles in the
region, and the rate of the passage of time as measured
by clocks, change periodically due to the oscillation of
the gravitational wave.
To see the effect of a gravitational wave, consider
tiny particles which are arranged in a circle in space, as
shown in Figure 1. Now suppose a gravitational wave
passes in a direction perpendicular to the circle into the
plane of the paper. The change in the geometry produced
by the wave is such that there is alternately expansion
and contraction of distances along the horizontal and
vertical directions in the plane of the paper: when there
is contraction along the horizontal as shown in the
figure, there is expansion along the vertical direction.
As time passes, and the gravitational wave propagates
forward, the directions of contraction and expansion
are exchanged. For particles which are not along the
horizontal or vertical directions, whether expansion
or contraction takes place depends on the direction in
which they are located. The net effect is that the circle
of particles changes shape periodically: it alternately
becomes an ellipse first stretched along the x-direction
and then stretched along the y-direction, as shown in
the upper part of Figure 1.
The periodic change of shape can occur in a
somewhat different way, with the maximum stretching
taking place in a direction which is inclined to the
x- and y- directions, as shown in the lower part of
the figure. This occurs because of difference in the
polarisation of the gravitational waves. If a gravitational
wave approaches the circle in a direction which is not
perpendicular to the plane of the circle, or there is more
than one gravitational wave passing through at the same
time, then the effects will be similar to those described,
with the details depending on the situation.
Figure 1: The effect of a gravitational wave on
a set of free point masses arranged in a circle. The
gravitational wave travels into the plane of the paper, in
a direction perpendicular to the plane. The upper and
lower parts of the figure correspond to two polarisation
states. Image Credit: Kaushal Sharma.
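The two polarisation states can be written down explicitly. For a wave travelling along the z-direction, a standard textbook form of the metric perturbation (quoted here as an illustration; it is not taken from this article) is

\[ h_{ij} = \begin{pmatrix} h_{+} & h_{\times} & 0 \\ h_{\times} & -h_{+} & 0 \\ 0 & 0 & 0 \end{pmatrix} \cos\big(\omega(t - z/c)\big), \]

where the \(h_{+}\) ("plus") component stretches the x-direction while squeezing the y-direction, producing the upper pattern of Figure 1, and the \(h_{\times}\) ("cross") component does the same along axes rotated by 45 degrees, producing the lower pattern.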
1.4 Similarities and Differences
Between Electromagnetic and
Gravitational Waves
The equations for electromagnetic waves follow
from Maxwell’s equations while the equations from
gravitational waves follow from Einsteins equations.
In both cases, the waves travel with the speed of light,
they are transverse waves and are polarised. In the case
of electromagnetic waves we have learnt through the
black hole stories that they can be redshifted, and can
be bent from their straight path, leading to gravitational
lensing. Gravitational waves too can undergo these
effects.
A major difference between the theories of
gravitational waves and electromagnetic waves is the
following: The Electromagnetic wave equations are
derived from Maxwell’s equations, with no conditions
attached, except the gauge condition. The equations
apply for any distribution of currents and charges, and
the electromagnetic field can be large. The gravitational
wave equations, on the other hand, apply only when
the gravitational field is weak and acts as perturbation
over flat space-time. In extreme circumstances like the
merger of black holes the fields can be very large and
then the full Einstein equations have to be used. But that
is impossible to do analytically, so numerical methods
are needed to study the processes. Another difference
is that while the wavelength of gravitational waves is in
the range of a few km to about 10²¹ km, electromagnetic waves have much shorter wavelengths, in the range of 10⁻¹² m to a few kilometres. Yet another difference,
which we simply mention here for completeness, is that
the electromagnetic waves are mainly dipolar in nature
while the gravitational waves are mainly quadrupolar.
As electromagnetic waves propagate through space
they can be absorbed by intervening matter, with the
extent of absorption depending on the wavelength of
the radiation and the density, temperature, chemical
composition and other properties of the matter. That can
change the appearance of an object emitting the waves,
and in extreme cases the waves can be completely
absorbed at some wavelengths so that the emitter
becomes unobservable at those wavelengths. An
example of such an absorbing medium is the Earth’s
atmosphere. While visible light is able to penetrate the
atmosphere, electromagnetic waves at ultraviolet and X-
ray wavelengths are absorbed almost completely. The
situation is quite different with gravitational waves,
which remain unaffected by passage through even
the most dense environments. This is due to the
fact that the gravitational force is extremely weak,
and the gravitational waves therefore do not interact
significantly with matter. We are therefore in principle
able to observe gravitational waves from the most distant
sources and extreme conditions, like the very early
epochs of the Universe and the very dense matter in the
environment of merging neutron stars. Our ability to
detect gravitational waves has therefore opened a new
window to the Universe, giving rise to multimessenger
astronomy.
Next Story: In the next story we will consider the
generation of gravitational waves.
About the Author
Professor Ajit Kembhavi is an emeritus
Professor at Inter University Centre for Astronomy
and Astrophysics and is also the Principal Investigator
of the Pune Knowledge Cluster. He was the former
director of Inter University Centre for Astronomy and
Astrophysics (IUCAA), Pune, and the International
Astronomical Union vice president. In collaboration
with IUCAA, he pioneered astronomy outreach
activities from the late 80s to promote astronomy
research in Indian universities.
X-ray Astronomy: Through Missions
by Aromal P
airis4D, Vol.3, No.1, 2025
www.airis4d.com
2.1 Plus Virtutis: Golden Age Begins
Until now, we discussed the advances in X-ray
astronomy until the 1980s. The early discoveries not
only revealed how fascinating it is to study X-rays, but
also highlighted the necessity of powerful instruments
to study the exotic phenomena in our universe. The
1990s marked the dawn of the golden age of X-ray
astronomy, a transformative era during which massive
satellites equipped with powerful instruments were
launched into space. The groundbreaking results from
these missions paved the way for future generations to
deepen their understanding of X-ray emitting sources.
In this section, we will discuss this golden age.
2.2 Satellites in 1990s
2.2.1 Granat
Granat, a USSR mission in collaboration with
other European countries, was launched on December
1, 1989. It started as a pointing observation satellite in a highly elliptical orbit, with perigee and apogee of 2,000 km and 200,000 km, respectively, and an orbital inclination of 51.9°, providing a more extended orbital period of 98 hours and longer uninterrupted observation
time. Over time, the satellite attained a nearly circular
orbit with a perigee and apogee of 60,000 km and
140,000 km, respectively, and later on, went into survey
mode in September 1994. The satellite was operational
until November 1998. This satellite had an energy
coverage of 2 keV - 2 MeV, from X-rays to Gamma rays.
Figure 1: MWPC in Granat.
Credit: Sunyaev et al.
There were seven scientific instruments onboard
Granat to study X-rays and Gamma-rays:
Astronomical X-ray telescope (ART-P) is an
imaging telescope with Position Sensitive
Multiwire Proportional Counters (MWPCs) and
coded apertures that produce images of selected
areas of the sky with a better angular resolution
of six arc minutes in the energy range 4-60 keV.
The aperture is coded by Uniformly Redundant
Array (URA) coupled with a collimator. It can also be used for timing and spectral studies in the 4-100 keV energy range (a toy sketch of the coded-aperture principle follows this instrument list). Granat had four identical modules of the ART-P telescope.
Astronomical X-ray telescope-spectrometer
(ART-S) is a system dedicated to the
spectroscopic study of X-ray sources in energy
range 3-100 keV. It consists of 4 spectroscopic
MWPCs. Both ART-P and ART-S were
developed in Moscow by IKI.
SIGMA is a Gamma-ray telescope
developed by two French laboratories -
the Serviced’Astrophysique at Saclay and the
Centre d’Etude Spatiale des Rayonnements at
Toulouse. It is the primary scientific instrument
Figure 2: SIGMA telescope in Granat.
Credit: Paul et al.
in Granat and the first coded-aperture telescope
sensitive to radiation in the energy range 35 keV
- 1.3 MeV to be operated in space. The telescope has a Position-Sensitive Scintillation Detector (PSD), active and passive shielding devices, and necessary modules. The coded aperture enables imaging capability with a resolution of 13 arc minutes; this instrument's aperture is also coded by a URA.
WATCH is an all-sky monitor developed by the
Danish Space Research Institute that works in
an energy range of 6-180 keV. It has a rotating
modulation collimator system with Phoswich
detectors consisting of parallel NaI(Tl) and CsI(Tl) strips for observing bright X-ray sources within 0.5°.
PHEBUS is a Gamma-ray instrument developed
by Centre d’Etude Spatiale des Rayonnements at
Toulouse that can cover an energy range of 100
keV - 100 MeV. This all-sky coverage instrument
consists of two independent bismuth germanate
(BGO) detectors surrounded by a plastic anti-
coincidence jacket.
KONUS-B is a Gamma-ray instrument developed
by Ioffe Physico-Technical Institute in St.
Petersburg to carry out Gamma-ray burst studies
using traditional scintillation technique in the
energy range of 0.01-8 MeV. The detectors have
NaI(Tl) scintillation crystals. This instrument
also had an all-sky coverage and operated for
nearly 27 days, detecting around 19 Gamma-ray
bursts and a couple of solar flares.
TOURNESOL observes optical counterparts of high-energy events. It consisted of four proportional counters that detect in the energy range of 2 keV - 20 MeV and two optical detectors with a large FOV of 5° × 5°.
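Since both ART-P and SIGMA rely on coded apertures, a toy one-dimensional sketch of the principle may help: a mask encodes the sky on the detector, and cross-correlation with the mask pattern recovers the source positions. The random binary mask below merely stands in for a true Uniformly Redundant Array, and all numbers are illustrative.

# Toy 1D coded-aperture demo with a random mask standing in for a URA.
import numpy as np

rng = np.random.default_rng(0)
n = 64
mask = rng.integers(0, 2, size=n)         # 1 = open element, 0 = opaque

sky = np.zeros(n)
sky[10], sky[40] = 100.0, 60.0            # two point sources (counts)

# Each point source casts a shifted shadow of the mask on the detector.
detector = np.zeros(n)
for pos in range(n):
    if sky[pos]:
        detector += sky[pos] * np.roll(mask, pos)

# Decode by correlating the detector pattern with a balanced (+1/-1) mask.
decoder = 2 * mask - 1
image = np.array([np.dot(detector, np.roll(decoder, p)) for p in range(n)])
print(int(image.argmax()))                # expected to peak at 10, the brighter source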
In the initial four years, from the pointed observations it
studied many galactic (including stellar black holes,
neutron stars, and white dwarfs) and extragalactic
sources through timing, spectral and imaging
capabilities. In the survey mode, it discovered many
new X-ray candidate sources. Granat first discovered the electron-positron annihilation lines from the galactic micro-quasar 1E 1740.7-2942. It also carried out very deep imaging of the galactic center region.
2.2.2 ROSAT
The Roentgen Satellite (ROSAT) was a
collaboration between Germany, the United States, and
the United Kingdom. It was launched in June 1990 into a nearly circular orbit at 550 km altitude with an inclination of 53°, and remained active and operational for 9 years,
till February 1999. Its initial 6 months were dedicated
to all-sky surveys, and later, it started to carry out
pointed observations. ROSAT was 1000 times better
than the UHURU satellite in terms of X-ray sensitivity.
Satellite and X-ray telescopes were built and operated
by Germany; the UV telescope was built by the UK,
and NASA provided the launch vehicle.
ROSAT consisted of two scientific instruments:
X-ray Telescope (XRT): A Wolter-I X-ray
telescope that operates in the soft X-ray range
(0.1-2.4 keV). It has four nested Wolter type-1
mirrors to concentrate the X-ray photons into
the detector. The focal plane assembly consisted
of two Position Sensitive Proportional Counters (PSPC) of 2° FOV and one High-Resolution Imager (HRI) of 36′ × 36′ FOV that act as the detectors.
Wide Field Camera (WFC): It is an Extreme
Ultraviolet (EUV) imaging telescope that
operates in 0.06-0.2 keV with an FOV of 5°.
It consists of three nested Wolter-Schwarzschild
Figure 3: ROSAT Satellite.
Credit: NASA
type-I mirrors and two channel plate detectors.
ROSAT was very successful in demonstrating
technological capabilities and created significant
scientific discoveries. It performed the first all-sky
survey with an imaging telescope and found more than
150,000 and nearly 500 sources in the X-ray and EUV
ranges, respectively. It also discovered the first isolated
neutron star and was the first-ever satellite to study the
X-ray emission from a comet. It studied the collision of
comet Shoemaker-Levy 9 with Jupiter. It also detected pulsations from Geminga and diffuse X-ray emission from molecular clouds.
2.2.3 BBXRT
Broad Band X-ray Telescope (BBXRT) was the first imaging telescope operating in the 0.3-12 keV band. It was part of the ASTRO-1 mission payload flown on the space shuttle Columbia in December 1990. It
was operational for around 9 days. The mission’s
main aim was to conduct high-sensitivity, moderate-
resolution spectrophotometry in the given energy range.
It had some difficulties in pointed observations due to an improperly compensated gyro drift rate; even so, it successfully accumulated 180,000 seconds of observations of cosmic X-ray sources. BBXRT mainly consisted of two co-aligned
conical X-ray imaging mirrors with a cryogenically-
cooled Si(Li) non-dispersive spectrometer at its focus.
Despite the difficulties, BBXRT resolved the iron K
line in the binaries Centaurus X-3 and Cygnus X-2, and
detected evidence of line broadening in NGC 4151. It
also provided insight into cooling flow in clusters of
galaxies.
Figure 4: Conceptual view of the BBXRT experiment
package.
Credit: Petre et al.
Figure 5: ASCA Satellite.
Credit: JAXA
2.2.4 ASCA
Advanced Satellite for Cosmology and
Astrophysics (ASCA), formerly ASTRO-D, is
the fourth X-ray astronomical satellite developed by
the Institute of Space and Astronautical Science. The
satellite was launched in February 1993 into a 31.1°-inclined, nearly circular orbit of perigee and apogee
of 523 km and 615 km, respectively. It was the first
satellite that used Charge Coupled Devices for X-ray
astronomy.
ASCA had two scientific payloads:
X-ray Telescope (XRT): It consists of four grazing incidence telescopes, each attached to an imaging spectrometer in its focal plane. The telescope utilizes multi-nested thin-foil conical optics developed by Serlemitsos. It has an energy bandwidth of 0.5-12 keV. XRT can resolve sources separated by 1′.
Focal Plane Detectors consist of two CCD cameras - the Solid-state Imaging Spectrometer (SIS) - and two gas scintillation imaging proportional counters - the Gas Imaging Spectrometer (GIS). SIS works in the energy band-pass of 0.4-12 keV with a square FOV of 20′ × 20′. GIS, on the other hand, works in the energy range 0.8-12 keV with a circular FOV of 50′.
So far we have discussed X-ray satellites from the first half of the 1990s; many interesting missions are yet to come, and we will discuss them in the upcoming article.
References
Santangelo, Andrea; Madonia, Rosalia; Piraino, Santina. A Chronological History of X-Ray Astronomy Missions. Handbook of X-ray and Gamma-ray Astrophysics. ISBN 9789811645440.
Broad Band X-ray Telescope-I
Broad Band X-ray Telescope-II
Granat X-ray and Gamma-ray Observatory-I
Granat X-ray and Gamma-ray Observatory-II
The ROSAT Mission
Petre, R.; Serlemitsos, P. J. The Broad Band X-Ray Telescope. Space Science Reviews.
Tanaka, Y.; Inoue, H.; Holt, S. S. The X-ray astronomy satellite ASCA. PASJ: Publications of the Astronomical Society of Japan (ISSN 0004-6264), Vol. 46, No. 3, pp. L37-L41.
S. V. Golenetskii, R. L. Aptekar, I. V. Dementyev, Yu. A. Guryan, V. N. Ilyinskii, E. P. Mazets, V. N. Panov, Z. Ya. Sokolova, D. D. Frederiks, T. V. Kharitonova and L. O. Sheshin. Gamma-Ray Burst Observations with KONUS-B on Granat Spacecraft. Adv. Space Res., Vol. 11, No. 8, pp. (8)125-(8)128, 1991.
J. Paul, P. Mandrou, J. Ballet, M. Cantin, J. P. Chabaud, B. Cordier, M. Ehanno, A. Goldwurm, A. Lambert, J. Landé, P. Laurent, F. Lebrun, J. P. Leray, B. Ména, M. Niel, J. P. Roques, G. Rouaix, L. Salotti, P. Souleile and G. Vedrenne. SIGMA: The Hard X-ray and Soft Gamma-ray Telescope on Board the Granat Space Observatory. Adv. Space Res., Vol. 11, No. 8, pp. (8)289-(8)302, 1991.
R. A. Sunyaev, S. I. Babichenko, D. A. Goganov, S. R. Tabaldyev and N. S. Jambourenko. X-ray Telescopes ART-P and ART-S for the Granat Project. Adv. Space Res., Vol. 10, No. 2, pp. (2)233-(2)237, 1990.
About the Author
Aromal P is a research scholar in
Department of Astronomy, Astrophysics and Space
Engineering (DAASE) in Indian Institute of
Technology Indore. His research mainly focuses on
studies of Thermonuclear X-ray Bursts on Neutron star
surface and its interaction with the Accretion disk and
Corona.
What are Galaxies?
by Linn Abraham
airis4D, Vol.3, No.1, 2025
www.airis4d.com
3.1 Interstellar dust
The Milky Way is a disc-shaped distribution of stars, often visible to the naked eye from suitable places on the Earth. In a previous article, we had seen that until Hubble's discovery in the 1920s, it was believed that the whole of the observable universe was contained in our Milky Way, an island floating in empty infinite space. It was then understood that those fuzzy objects seen in the night sky were worlds distant from ours and were replicas of our own galaxy. Initially they were classified together with what we today call nebulae. These nebulae, or dark patches like the famous Horsehead Nebula, are actually regions of dust concentration. The dark patches often seen within our galaxy are due to this dust and not the absence of stars, as was thought to be the case earlier. Interstellar dust may
contain graphite, silicates and solid hydrogen. Dust
reduces the intensity of star light through absorption
and scattering.
Figure 1: Image showing the most prominent dark
lane which stretches from the constellations Cygnus to
Sagittarius called the Great Rift or the Dark Rift.
Image Courtesy: NASA
Figure 2: Rotation curve of a spiral galaxy. The dotted
curve shows how the rotation curve should have sloped
down had the entire mass of the galaxy been confined
to its visible boundary.
Image Courtesy: J. V. Narlikar, Introduction to Cosmology.
3.2 Rotation and Presence of Gas
The diameter of the disc of the Milky Way is estimated to be ~30 kpc and its thickness ~1 kpc. The galaxy rotates about its polar axis, however not like a rigid body. Figure 2 shows a typical rotation curve. At a distance r from the center O of the galaxy, a circular Keplerian orbit will have a velocity given by

\[ v = \sqrt{\frac{GM(r)}{r}}, \]

where M(r) is the mass of the galaxy contained within
the radius r and the point A represents the visible
extent of the galaxy. The dotted lines shows how the
velocity should have decreased beyond A. However,
what we see in practice is the velocity remaining more
or less constant upto two or three times the visible
boundary. Some stars like our Sun go around the
Galactic Center, whereas some have highly eccentric
orbits that take them out of the galactic plane. The
former type are called population-I stars and the latter
are called population-II stars. The population-II stars
are seen to be older than population-I stars. The mass
of our Galaxy is estimated at 1.4 × 10¹¹ solar masses.
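As a rough check with illustrative numbers (not taken from the article), a flat rotation speed of v ≈ 220 km/s out to r ≈ 15 kpc gives

\[ M(r) \approx \frac{v^{2} r}{G} = \frac{(2.2 \times 10^{5}\,\mathrm{m\,s^{-1}})^{2} \times (4.6 \times 10^{20}\,\mathrm{m})}{6.67 \times 10^{-11}\,\mathrm{m^{3}\,kg^{-1}\,s^{-2}}} \approx 3.4 \times 10^{41}\,\mathrm{kg} \approx 1.7 \times 10^{11}\, M_{\odot}, \]

of the same order as the quoted mass. Note also that a flat rotation curve (v roughly constant) implies M(r) ∝ r, so the enclosed mass keeps growing well beyond the visible boundary.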
The absorption lines seen in the spectra of stars in our galaxy show that absorbing gases are present in the interstellar medium. The emission nebulae consist of gas that absorbs the UV radiation from stars and radiates it as visible light in spectacular colours. There are also hot regions near stars which contain hydrogen as H II, ionized by the UV radiation from stars. In contrast, there are also cool regions containing atomic hydrogen, H I. These were detected using 21 cm observations in radio astronomy.
3.3 Spirals and Ellipticals
Galaxies come in two major types, spirals and ellipticals. Spirals are probably the most numerous amongst bright galaxies. Like our own galaxy, they show rotation and flattening into a disc, the presence of a central bulge, and dark lanes of absorbing matter.
While spirals are most numerous among bright galaxies,
the most numerous of all galaxies are the ellipticals.
They are ellipsoidal in shape, show very little rotation
and have very little gas and dust. Let us revise the
modern classification scheme based on the morphology
of galaxies that is primarily due to Hubble. The spirals
are arranged in a sequence called Sa, Sb, Sc etc. in
decreasing order of the importance of the central nucleus
or bulge in relation to the surrounding disc. Along the
sequence, the central spheroid decreases in luminosity
and spiral arms become more loosely wound. Spirals
can have bars in the central region and when that
happens they are designated SBa, SBb, SBc, etc. The
ellipticals are arranged in a sequence E0, E1, ...E7 based
on the progressive flattening of the profiles. E0s have
an almost spherical shape and E7s have a lenticular
shape. The distribution of light across galaxies is
conveniently described in terms of contours of equal
intensity called isophotes. Another type of galaxy called
S0, is intermediate between the ellipticals and spirals.
(a) The elliptical galaxy ESO 325-G004 part of the
Abell S740 cluster.
Image Courtesy: NASA, ESA, and The Hubble Heritage Team.
(b) Spiral galaxy NGC 1566
Image Courtesy: Dark Energy Survey
Figure 3: Types of galaxies
They have little gas and dust like the ellipticals but their
isophotes are similar to those of spirals. They may
have formed from collisions of spirals and ellipticals.
Since the stars are widely spaced, they may pass through without problems during the collision, but interstellar gas and dust might be thrown out into intergalactic space. The isophotes in this case, which arise from starlight alone, might remain intact. While this arrangement of
galaxies often called the Hubble tuning fork seems to
have some order, it is still not known if this represents
some evolutionary sequence or not. In addition to these
there exists another class called ‘irregular’ which show
Figure 4: Image shows a massive group of galaxies: a
cluster named RXC J0032.1+1808.
Image Courtesy: ESA/Hubble
no particular pattern. Until our knowledge of galaxy structure and evolution improves significantly, these are to be seen as empirical models that can help us better understand the underlying physics. Although we can guess the spiral nature of our galaxy from the presence of the disc, what more can be said about it, given the disadvantage of viewing the galaxy from within?
3.4 Structures on a Larger Scale
Galaxies themselves are seen to form patterns on a
larger scale. Galaxy groups may contain a few to a few
tens of galaxies whereas clusters can contain several
hundreds of galaxies. Galaxies that do not belong to
such groups or clusters are called field galaxies. Our
galaxy belongs to a group of ~28 galaxies, known as the Local Group, separated by distances of up to ~1 Mpc. Our closest neighbours are the Large and Small
Magellanic Clouds at an approximate distance of 50 kpc.
Astronomers have also discovered structures on a much
larger scale than clusters. This is done by studying the
distribution of clusters across the sky and looking for
any grouping or clumping. Larger structures on a scale
of ~50 Mpc, compared to cluster scales of ~5 Mpc,
have been discovered. These are called superclusters.
Our galaxy belongs to one such supercluster known as
the Local Supercluster.
References
[Narlikar(1993)] Jayant Vishnu Narlikar. Introduction to Cosmology. Cambridge University Press, Cambridge [England]; New York, NY, USA, 2nd ed., 1993. ISBN 978-0-521-41250-6, 978-0-521-42352-6.
[Carroll and Ostlie(2014)] Bradley W. Carroll and Dale A. Ostlie. An Introduction to Modern Astrophysics. Pearson, Harlow, 2nd ed., Pearson new international ed., 2014. ISBN 978-1-292-02293-2.
About the Author
Linn Abraham is a researcher in Physics,
specializing in A.I. applications to astronomy. He is
currently involved in the development of CNN based
Computer Vision tools for prediction of solar flares
from images of the Sun, morphological classifications
of galaxies from optical images surveys and radio
galaxy source extraction from radio observations.
Corona Borealis: The Northern Crown
by Sindhu G
airis4D, Vol.3, No.1, 2025
www.airis4d.com
Figure 1: Corona Borealis. (Image Credit: seasky.org)
4.1 Introduction
Corona Borealis (Figure 1), also known as "The Northern Crown", is a small but striking constellation located in the northern celestial hemisphere. Its name, derived from Latin, aptly describes its semi-circular arrangement of stars that resemble a crown. Positioned between the constellations Boötes and
Hercules, Corona Borealis is rich in mythological
history, scientific interest, and astronomical phenomena.
Despite its modest size, it has fascinated stargazers and
astronomers for centuries. This article delves into
the mythology, astronomy, cultural significance, and
scientific relevance of Corona Borealis, offering an
in-depth exploration of its allure.
Figure 2: IAU Corona Borealis chart. (Image Credit:
IAU and Sky & Telescope magazine (Roger Sinnott &
Rick Fienberg))
4.2 Mythological Origins
The origins of Corona Borealis trace back to
ancient Greek mythology. The constellation is most
famously associated with the tale of Ariadne, the
daughter of King Minos of Crete. According to the
myth, Ariadne played a crucial role in helping the hero
Theseus navigate the Labyrinth to defeat the Minotaur.
Afterward, Theseus abandoned her on the island of
Naxos, where she was discovered by the god Dionysus.
As a gesture of love and devotion, Dionysus placed a
jeweled crown in the heavens, creating the constellation
Corona Borealis.
Other cultures also recognized this constellation,
attributing their own stories to its formation. For
example, the Arabs referred to it as Alphecca, a name
derived from its brightest star, symbolizing a "broken ring." Native American tribes, such as the Lakota, saw
it as a campfire circle, reflecting its unique shape. These
diverse interpretations highlight the universal appeal of
Corona Borealis across different civilizations.
4.3 Historical Astronomical
References
6th century BC: First written account by Greek
philosopher Pherecydes.
3rd century BC: Mentioned by Greek poet Aratus.
2nd century AD: Cataloged by Ptolemy.
Ancient Arab astronomers called it al-Fakkan (meaning "bowl").
4.4 Astronomical Features
Though small, Corona Borealis offers intriguing
astronomical features that captivate scientists and
amateur astronomers alike. Spanning an area of
approximately 179 square degrees, it is the 73rd largest
constellation in the night sky. Its brightest stars and
deep-sky objects make it a fascinating subject for
observation and study.
1. Bright Stars:
Alphecca (α Coronae Borealis): Also
known as Gemma, this is the brightest
star in the constellation, with an apparent
magnitude of 2.2. Alphecca is a binary star
system located about 75 light-years away
from Earth. The primary star is a white
main-sequence star, while its companion is
a dimmer red dwarf.
Nusakan (β Coronae Borealis): The
second brightest star in the constellation,
Nusakan is also a binary system situated
approximately 114 light-years from Earth.
It consists of a chemically peculiar primary
star and a smaller companion.
2. Variable Stars:
Corona Borealis contains several variable stars,
including the notable R Coronae Borealis and T
Coronae Borealis.
R Coronae Borealis (R CrB): R Coronae
Borealis stars are a rare type of yellow
supergiant. They experience sudden,
unpredictable dimming, believed to be
caused by carbon soot clouds in their
atmospheres. Typically visible with
binoculars, these stars can dim dramatically,
sometimes to a magnitude of 15. Studying
them has provided valuable insights into
the later stages of stellar evolution and the
formation of carbon dust.
T Coronae Borealis (The Blaze Star): T
Coronae Borealis is a recurrent nova located
3,000 light-years away. It is a binary system
composed of a white dwarf and a red giant.
Approximately every 80 years, it undergoes
thermonuclear explosions. Major outbursts
were recorded in 1866 and 1946, with
the next expected in September 2024.
This system remains of great interest to
astronomers due to its dramatic and periodic
behavior.
3. Deep-Sky Objects:
Though not densely packed with deep-sky objects,
Corona Borealis hosts the Corona Borealis
Galaxy Cluster, a massive cluster of galaxies
located more than a billion light-years from Earth.
This cluster is of particular interest to astronomers
studying the large-scale structure of the universe.
4. Exoplanet Discoveries:
The constellation has also contributed to the
discovery of exoplanets. Stars within Corona
Borealis, such as Alphecca, have been studied for
planetary systems, expanding our understanding
of planetary formation and habitability.
4.5 Observing Corona Borealis
Corona Borealis is best observed in the northern
hemisphere during late spring and summer months,
when it is prominently visible in the evening sky. Its
semi-circular arrangement of stars makes it easy to
identify, lying near the bright star Arcturus in the
constellation Boötes. The seven main stars of Corona
Borealis form an arc, creating the appearance of a
crown.
Finding Corona Borealis:
Begin by locating the bright star Arcturus in
Boötes. From there, look slightly eastward to
find the curved pattern of stars that form Corona
Borealis.
For southern hemisphere observers, the
constellation appears lower in the sky and is
visible during the winter months.
Tips for Stargazers:
Use binoculars or a telescope to explore the
binary star systems and faint galaxies within the
constellation.
On clear nights, look for the subtle variations in
brightness of variable stars such as R Coronae
Borealis.
Best Viewing Conditions: To fully appreciate
Corona Borealis, find a location with minimal
light pollution on a clear, moonless night. Under
these conditions, the arc of stars forming its crown
shape becomes more distinct. After locating
Boötes, Corona Borealis can be easily spotted
nearby, offering a stunning view of this small yet
captivating constellation.
4.6 Cultural Significance
Throughout history, Corona Borealis has held
significant cultural and symbolic meanings. Its
striking shape made it a prominent feature in various
mythologies and traditions. In addition to its association
with Ariadne in Greek mythology, other cultures offered
unique interpretations:
Arabic Astronomy: In Arabic tradition, the
constellation was called Alphecca, symbolizing
a broken ring or a string of pearls.
Chinese Astronomy: In Chinese star lore, Corona
Borealis was part of the "Heavenly Market Enclosure", representing a celestial marketplace.
Native American Tribes: Among some Native
American tribes, the constellation was seen as a
campfire circle, a gathering place for people.
These interpretations underscore the universal
connection between humanity and the stars, as diverse
cultures found meaning in the constellation's distinctive
arc.
4.7 Scientific Contributions
Corona Borealis has contributed to significant
advancements in astronomy and astrophysics. The
study of its variable stars, particularly R Coronae
Borealis, has provided insights into stellar evolution
and the behavior of rare star types. Additionally, the
Corona Borealis Galaxy Cluster has been instrumental
in understanding the large-scale structure of the universe
and the distribution of dark matter.
Variable Stars and Stellar Evolution:
R Coronae Borealis stars are a rare class of stars
that experience unpredictable dimming. Studying these
stars has helped astronomers understand the late stages
of stellar evolution and the formation of carbon dust.
Galaxy Clusters and Dark Matter:
The Corona Borealis Galaxy Cluster has been a
valuable resource for studying the distribution of dark
matter. Observations of gravitational lensing and galaxy
movements within the cluster provide clues about the
unseen matter that makes up a significant portion of the
universe.
Exoplanetary Studies:
The discovery of exoplanets within Corona
Borealis has expanded our knowledge of planetary
systems. Detailed observations of stars in this
constellation have revealed new worlds, some of which
are in the habitable zone, sparking curiosity about the
potential for life beyond Earth.
4.8 Astrological Significance
Although primarily known for its scientific and
mythological importance, Corona Borealis also holds
a place in astrology. Astrologers often associate the
constellation with creativity, leadership, and the pursuit
of goals. Its semi-circular shape, reminiscent of a
crown, symbolizes triumph and achievement. While
astrology is distinct from astronomy, the constellation's symbolic representation continues to inspire those who
look to the stars for guidance.
4.9 Conclusion
Corona Borealis, the Northern Crown, is a
constellation that beautifully combines mythology,
science, and cultural significance. Its graceful arc of
stars has inspired stories across cultures and continues
to captivate stargazers and astronomers alike. Whether
viewed through the lens of ancient myths, cutting-edge
science, or personal symbolism, Corona Borealis serves
as a shining example of humanity’s enduring fascination
with the night sky. Small yet significant, it remains a
celestial crown adorning the heavens, a testament to the
wonders of the universe.
References:
Corona Borealis
Rare T Coronae Borealis nova explosion this
September: What it is and how to see it
NASA, Global Astronomers Await Rare Nova
Explosion
T Coronae Borealis Nova Explosion 2024: Has
T CrB Nova Happened Yet?
Corona Borealis (The Northern Crown)
Constellation
Corona Borealis Constellation
Corona Borealis
A Blaze star or 'Nova' in Corona Borealis
A Guide to the Corona Borealis Constellation
and Its Stars
Corona Borealis
About the Author
Sindhu G is a research scholar in Physics
doing research in Astronomy & Astrophysics. Her
research mainly focuses on classification of variable
stars using different machine learning algorithms. She
is also doing the period prediction of different types
of variable stars, especially eclipsing binaries and on
the study of optical counterparts of X-ray binaries.
Part III
Biosciences
A Short Note on the Discovery of MicroRNA
and Its Role in Post-Transcriptional Gene
Regulation
by Geetha Paul
airis4D, Vol.3, No.1, 2025
www.airis4d.com
The 2024 Nobel Prize in Physiology or Medicine
has been awarded to Victor Ambros and Gary Ruvkun
for their groundbreaking discovery of microRNA, a
fundamental mechanism of gene regulation.
1.1 Introduction
The evolution of multicellular organisms from
unicellular ancestors involved the development of
specialised cell types, each acquiring distinct functions.
This transition required increasingly sophisticated gene
regulation mechanisms. Initially, gene regulation was
primarily transcriptional, mediated by DNA-binding
factors that acted on regulatory sequences. However, as
organisms became more complex, additional regulatory
systems emerged. One significant development was the
expansion of genes encoding microRNAs (miRNAs)
within the genomes of multicellular organisms.
These tiny, non-coding RNA molecules exert post-
transcriptional control over mRNA stability and protein
translation. The groundbreaking discovery of miRNAs
by Victor Ambros and Gary Ruvkun in 1993 revealed a
novel class of regulatory RNA. They studied mutant C.
elegans nematodes with developmental defects linked to
the lin-4 and lin-14 genetic loci. Ambros’s lab found that
the lin-4 gene produced a short 22-nucleotide RNA that
regulated lin-14 by binding to its 3’ untranslated region
(3’ UTR). This discovery led to the identification of
other miRNAs, such as let-7, which are highly conserved
across animal species, including humans. Researchers
recognised that miRNAs could regulate vast networks
of protein-coding genes, playing critical roles in animal
development and adult tissue function. Over time, it
became clear that miRNAs are essential for normal
cellular processes and that their dysregulation could
contribute to various diseases, including cancer.
1.2 What is microRNA?
microRNA (miRNA) is a small, non-coding RNA
molecule that regulates gene expression. Unlike
messenger RNA (mRNA), which carries instructions
to make proteins, microRNA works by binding to
specific mRNA molecules, preventing them from
being translated into proteins or degrading the mRNA
altogether. MicroRNAs help fine-tune various cellular
processes like development, growth, and metabolism
by controlling which mRNA gets used to make proteins.
Their role is essential in maintaining normal cell
function, and disruptions in microRNA activity have
been linked to diseases such as cancer.
The human genome is now known to code
for over a thousand microRNAs, highlighting the
widespread importance of this regulatory mechanism.
This discovery has profound implications for our
understanding of cellular differentiation, development,
and disease processes.
1.3 Post-transcriptional Regulation
Figure 1: miRNA Biogenesis
Image courtesy: https://pmc.ncbi.nlm.nih.gov/articles/PMC6298569/
(A) The pathway of miRNA biogenesis, including
the Microprocessor-mediated (DROSHA/DGCR8) step
in the nucleus followed by DICER processing in the
cytoplasm. Structural and sequence features essential
for miRNA processing are highlighted in both the pri-
mir and pre-miR molecules. (B) Positive regulators of
miRNA biogenesis bind to the terminal loop (TL) or
other elements within miRNA precursors (pri-mir and
pre-mir) and stimulate Drosha and/or Dicer processing,
leading to increased levels of mature miRNAs (miR).
The TL, also known as the apical loop, is depicted in red.
(C) Negative regulators of miRNA biogenesis bind to TL
or other elements within miRNA precursors (pri-mir and
pre-mir) and abrogate Drosha and/or Dicer processing,
leading to decreased levels of mature miRNAs (miRs).
The TL, also known as the apical loop, is depicted in
red.
The biogenesis of miRNA involves multiple steps. Like protein-coding genes, miRNAs are transcribed by RNA polymerase II, either from independent miRNA-encoding genes or from introns of protein-coding genes. The initial transcript, the primary miRNA (pri-miRNA), is cleaved in the nucleus by Drosha and DGCR8. Drosha is a Class 2 ribonuclease III enzyme that initiates miRNA processing by cleaving pri-miRNAs into precursor miRNAs (pre-miRNAs); DGCR8 is an RNA-binding protein that works alongside Drosha as a key component of this Microprocessor complex. The resulting pre-miRNA is exported from the nucleus to the cytoplasm by exportin-5 (XPO5), where Dicer processes it into the mature miRNA. The mature miRNA is incorporated into the miRNA-induced silencing complex (miRISC), which contains the Argonaute 2 protein and the glycine-tryptophan protein GW182; it is within this complex that the miRNA and its mRNA target interact.
1.4 Mechanisms of Action
microRNAs bind to complementary sequences within the 3′ untranslated regions (3′ UTRs) of target mRNAs. This binding can lead to two primary outcomes:
Translational Repression: miRNAs can inhibit the translation process, preventing the mRNA from being converted into protein.
mRNA Degradation: miRNA binding can also destabilise the target mRNA, leading to its degradation.
In addition, miRNAs can interact with other regions of mRNA, including the 5′ UTR and coding sequences, and even influence gene transcription directly under certain conditions. This versatility allows miRNAs to fine-tune gene expression across various cellular contexts.
1.5 Biological Significance
The human genome encodes over a thousand
miRNAs, underscoring their widespread importance
in regulating numerous biological processes such as
development, metabolism, and cell differentiation. For
instance, miRNAs are essential for maintaining normal
cellular functions; disruptions in their activity are linked
to various diseases, including cancer and metabolic
disorders. Moreover, research has shown that a single
miRNA can regulate multiple target genes, while several
miRNAs may influence a single gene. This complex
regulatory network allows cells to respond dynamically
to environmental changes and developmental cues.
1.6 Implications for Medicine
The discovery of microRNAs has opened new
avenues for understanding disease mechanisms and
developing therapeutic strategies. For example,
aberrant regulation of miRNAs has been implicated in
cancer progression and other pathological conditions.
Understanding how miRNAs function could lead to
innovative treatments that target these molecules to
restore normal gene expression patterns.
1.7 Mechanisms of miRNA
Involvement in Cancer
microRNAs (miRNAs) play a complex and dual
role in cancer development, functioning as potential
oncogenes and tumour suppressors depending on
cellular context. These small non-coding RNAs
can dramatically influence cancer characteristics by
modulating key cellular processes such as proliferative
signalling, growth suppression, apoptosis, invasion,
metastasis, and angiogenesis. The mechanisms
of miRNA involvement in cancer involve various
dysregulation pathways, including gene amplification,
deletion, abnormal transcriptional control, epigenetic
changes, and defects in miRNA biogenesis machinery.
Notably, miRNAs can have context-specific effects,
with the same miRNA potentially acting as an oncogene
in one cancer type and a tumour suppressor in another.
Specific examples highlight this complexity: miR-125b
acts as an oncomiR in most hematologic malignancies
but functions as a tumour suppressor in many solid
tumours.
Figure 2: Mechanism of tumour suppressor miRNA regulation in cancer.
Image courtesy: https://pmc.ncbi.nlm.nih.gov/articles/PMC8554338/
The mechanisms controlling TS miRNAs operate at different steps in cancer, including transcriptional and post-transcriptional regulation. Much of the post-transcriptional regulation occurs in the miRNA biogenesis machinery and affects the components of miRNA processing. The processing of some TS miRNAs is regulated at multiple levels: at the gene level, through mutations in the miRNA gene itself and in other components of the processing pathway, such as Drosha, Dicer, TRBP, the binding sites of miRNA targets, and exportin-5. With XPO5 mutations, several TS miRNAs, such as miR-200 and let-7, are downregulated in cancer. miRNA transcription is altered by several oncogenes, such as Myc and TGF-β, or by epigenetic modification. LIN28 proteins specifically block the processing of the let-7 pre-miRNA. Several studies have highlighted that Dicer downregulation in cancer is caused by several factors, such as somatic mutations, TAp63, miR-360, and miR-103/107. EGFR suppresses the maturation of specific TS miRNAs by phosphorylating AGO2, which prevents the association of AGO2 with Dicer.
In chronic lymphocytic leukaemia, reduced miR-
15 and miR-16 expression can inhibit apoptosis, while
overexpression of miR-21, miR-210, and miR-221 in
breast cancer correlates with a worse prognosis.
Epigenetic interactions further complicate miRNAs’ role in cancer. DNA methylation can lead to
reduced expression of suppressor miRNAs, potentially
triggering cancer development. For instance, miRNAs
like miR-34 have been identified as critical tumour
suppressors across multiple cancer types, targeting
oncogenic pathways and potentially offering therapeutic
opportunities. The intricate nature of miRNAs in
cancer underscores their significant potential as both
diagnostic markers and therapeutic targets, representing
a promising avenue for future cancer research and
treatment strategies.
1.8 Conclusion
The discovery of microRNAs represents a
significant leap forward in molecular biology
and genetics. Researchers have gained insights
into fundamental biological processes and disease
mechanisms by elucidating how these small RNA
molecules regulate gene expression post-transcriptional
levels. As ongoing research continues to uncover
the complexities of miRNA function, their potential
medical applications promise exciting advancements in
therapeutic interventions.
References
https://www.thehindu.com/sci-tech/science/why-is-microrna-discovery-a-big-leap/article68745959.ece
https://academic.oup.com/gpb/article/7/4/147/7221818?login=false
https://pmc.ncbi.nlm.nih.gov/articles/PMC4013251/#R81
https://www.indiatoday.in/science/story/what-is-microrna-the-discovery-awarded-the-2024-nobel-prize-in-medicine-2612568-2024-10-07
https://pmc.ncbi.nlm.nih.gov/articles/PMC8554338/
About the Author
Geetha Paul is one of the directors of
airis4D. She leads the Biosciences Division. Her
research interests extend from Cell & Molecular
Biology to Environmental Sciences, Odonatology, and
Aquatic Biology.
Part IV
General
Evolution of Domain Generation Algorithms:
From Simple Patterns to AI-Driven
Adaptation
by Jinsu Ann Mathew
airis4D, Vol.3, No.1, 2025
www.airis4d.com
In the previous article, we explored the key
components of Domain Generation Algorithms (DGA),
including seed values, domain generation logic, entropy
sources, and the frequency and volume of generated
domains. These components form the foundation of
how malware dynamically generates domain names to
establish communication with Command and Control
(C2) servers while evading detection and takedown.
However, as cybersecurity defenses improved and static
detection methods became more effective, attackers
were forced to evolve their DGA techniques to maintain
resilience and adaptability in increasingly hostile digital
environments.
This article delves into the evolution of DGAs,
tracing their transformation from early, predictable,
pattern-based algorithms to the sophisticated, highly
dynamic DGAs used in modern malware campaigns. By
examining key milestones, notable malware examples,
and technical advances in DGA strategies, we aim to
highlight how this technology has adapted to outpace
traditional security measures. From simple time-based
algorithms to complex AI-driven domain generation
techniques, the evolution of DGAs is a testament to
the ongoing arms race between cybercriminals and
defenders.
1.1 The Early Days: Simple
Pattern-Based DGAs
In the early days of Domain Generation Algorithms
(DGAs), cybercriminals used relatively simple and
predictable methods to generate domain names. These
early DGAs were primarily based on static and
hardcoded algorithms, where the domain names were
generated from fixed patterns or predefined rules. The
main aim during this phase was to create multiple
domains that the malware could use to establish
communication with its command and control (C2)
servers, ensuring its persistence and avoiding detection
by defenders.
The domain generation logic in this period
typically involved simple mathematical operations, such
as appending or combining random strings of characters,
or using specific seed values that were embedded in the
malware itself. These seeds might include:
A hardcoded numeric value or string.
The system’s IP address or machine-specific
parameters.
A date or time value, though time-based
approaches were typically limited compared to later
implementations.
1.1.1 Key Characteristics:
Predictability: Since the algorithms were simple and often used hardcoded seeds, once the logic was discovered, the domains an attacker would generate could be easily anticipated and blocked.
Low Entropy: The domains generated often
followed simple patterns and had low entropy, meaning
that once a few domains were identified, it was relatively
straightforward to predict others.
Lack of Resilience: These DGAs were not very
resilient to countermeasures such as preemptively
blocking domains. If defenders could reverse-engineer
the malware and understand its domain-generation
pattern, they could easily stop the malware from
reaching its C2 servers.
For example, a simple pattern-based DGA might
generate a set of domains by appending a sequence
of numbers or characters to a static string, such
as domain1234.com, domain1235.com, etc. Once
researchers discovered the pattern, they could block
all future variants by recognizing this sequence. This
made early DGAs vulnerable and relatively easy to
defend against.
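To make this concrete, here is a minimal Python sketch of such a generator. The function name, prefix, and values are our own illustration, not taken from any real malware family:

```python
# Illustrative sketch of an early pattern-based DGA: a hardcoded
# prefix plus an incrementing counter (all names/values invented).

def simple_pattern_dga(prefix: str, start: int, count: int) -> list:
    """Generate `count` sequential domains from a static prefix."""
    return [f"{prefix}{start + i}.com" for i in range(count)]

print(simple_pattern_dga("domain", 1234, 3))
# ['domain1234.com', 'domain1235.com', 'domain1236.com']
```

Because each domain is trivially predictable from the previous one, a defender who recovers this logic can enumerate and block the entire sequence in advance.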
1.2 Intermediate Phase: Time-Based
and Seed-Driven DGAs
As defenders became more adept at blocking
domains generated by simple, pattern-based DGAs,
cybercriminals evolved their tactics to introduce more
complex mechanisms. In the intermediate phase,
they began using time-based and seed-driven methods,
which added a layer of dynamism and unpredictability
to domain generation, making it much harder to
precompute and block domains effectively.
1.2.1 Time-Based DGAs
In time-based DGAs, the malware uses the current
time or date as a core component of the domain
generation algorithm. By relying on an external factor
like the time, the generated domains change regularly,
often on a daily, hourly, or even minute-by-minute basis.
This approach made it much harder for defenders to
anticipate and block domains, as they would need to
continually update their blocklists to match the current
time window.
For example, the malware could:
Use the current date or timestamp as a seed to
generate domains.
Combine the date with a hashing function to
produce a unique domain for that particular day or
time.
Once the malware retrieves the current time from
the system, it could apply a simple formula, like hashing
the date and extracting part of the hash to form a domain.
For instance, the malware may take today’s date (e.g.,
2024-07-07) and generate a domain like a3f2b1c4.com.
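A minimal sketch of this idea, assuming (as an arbitrary choice on our part) an MD5 hash of the ISO-formatted date and an eight-character label, might look like this:

```python
import hashlib
from datetime import date

def time_based_domain(day: date) -> str:
    """Derive a short, day-specific domain from a hash of the date."""
    digest = hashlib.md5(day.isoformat().encode()).hexdigest()
    return digest[:8] + ".com"

# The same date always yields the same domain; a new day, a new one.
print(time_based_domain(date(2024, 7, 7)))
```

Any party that knows the algorithm and the date can reproduce the same domain, which is exactly what lets the malware and its C2 server rendezvous without any prior communication.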
The advantage of using time as a seed is that it makes each domain ephemeral, valid only for a specific period. However, early time-based DGAs were still relatively predictable: because attackers and defenders shared the same time reference, anyone who reverse-engineered the algorithm could compute the date or timestamp and derive the same domains in advance.
1.2.2 Seed-Driven DGAs
Seed-driven DGAs introduced another level of
complexity by relying on a dynamic or external seed
value. This seed could be anything from a hardcoded
value in the malware’s code to a value fetched from
an external source, such as an API or social media
feed. By using dynamic seeds, the domains generated
by the malware could be highly unpredictable and less
susceptible to precomputation.
For instance, malware might use a numeric seed
retrieved from an external API (such as a stock ticker
or weather data) and combine it with a hash function
to generate a domain. Alternatively, the seed might
be based on randomly generated values that are stored
within the malware’s configuration. This technique
allowed attackers to generate domains that were different
every time they checked the seed, making it difficult for
defenders to predict which domain would be used next.
A typical seed-driven DGA might involve steps like:
Retrieving an external seed (e.g., from a live API
or system parameter).
Using that seed in a hashing algorithm to produce
a domain.
The result would be a completely new domain
each time the seed changes.
For example, if the seed was a random
number like 12345, the domain generated might be
malicious86415.com, and with a new seed (23456), the
domain could change to malicious12345.com. This
unpredictability made it harder for defenders to block
the domains in advance, as the seed could change
dynamically.
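A hedged sketch of such a generator follows; the hash choice and the seed-to-suffix mapping are our own and will not reproduce the exact example domains above:

```python
import hashlib

def seed_driven_domain(seed: int) -> str:
    """Map an externally supplied seed to a pseudo-random domain."""
    digest = hashlib.sha256(str(seed).encode()).hexdigest()
    # Fold the first 8 hex digits into a 5-digit numeric suffix.
    return f"malicious{int(digest[:8], 16) % 100000}.com"

for seed in (12345, 23456):
    print(seed, "->", seed_driven_domain(seed))
```

Since the seed is fetched at run time from an external source, defenders cannot precompute the next domain without also tracking that source.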
1.3 Modern DGAs: Adaptive and
AI-Assisted Techniques
In recent years, Domain Generation Algorithms
(DGAs) have become significantly more sophisticated,
driven by advancements in adaptive algorithms and
the integration of artificial intelligence (AI). Modern
DGAs no longer rely on simple patterns or even basic
time-based and seed-driven methods; instead, they
have evolved to include self-adjusting mechanisms and
intelligent models that make prediction, detection, and
mitigation much more challenging. These modern
DGAs exhibit greater complexity, adaptability, and
unpredictability, making them much harder to detect
and block using traditional methods.
1.3.1 Adaptive DGAs: Dynamic and Evolving Approaches
The hallmark of adaptive DGAs is their ability
to evolve based on environmental feedback, making
them far more resilient and unpredictable. Unlike
earlier generations of DGAs that relied on static inputs,
adaptive DGAs dynamically adjust their behavior
based on the changing conditions in the environment.
For instance, these algorithms can alter the domain
generation process depending on network conditions,
geographic location, or responses from previous
attempts to communicate with C2 servers.
One key feature of adaptive DGAs is their feedback
mechanism. As malware communicates with its
command-and-control servers, the domains that are
successfully reached or blocked are fed back into the
algorithm. The system then uses this data to modify
the domain generation pattern. For example, if certain
domain names are consistently blocked by defenders,
the DGA might switch to a new method or use a new
seed value, making it much harder to preemptively block
the generated domains. This adaptability introduces
a significant challenge for defenders, as traditional
methods of blocking domains based on predictable
patterns are no longer sufficient.
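The toy class below sketches this feedback loop under our own simplifying assumptions (a fixed block threshold and seed rotation by re-hashing); real adaptive DGAs vary widely in how they consume feedback:

```python
import hashlib

class AdaptiveDGA:
    """Toy adaptive generator: rotates its seed once too many of its
    domains are reported blocked (environmental feedback)."""

    def __init__(self, seed: int, block_threshold: int = 3):
        self.seed = seed
        self.block_threshold = block_threshold
        self.blocked = 0
        self.counter = 0

    def next_domain(self) -> str:
        data = f"{self.seed}:{self.counter}".encode()
        self.counter += 1
        return hashlib.sha256(data).hexdigest()[:10] + ".net"

    def report_blocked(self) -> None:
        """Feedback hook: called when a generated domain was blocked."""
        self.blocked += 1
        if self.blocked >= self.block_threshold:
            # Rotate to a fresh seed and restart the sequence, so any
            # blocklist built against the old pattern becomes useless.
            digest = hashlib.sha256(str(self.seed).encode()).hexdigest()
            self.seed = int(digest[:8], 16)
            self.blocked = 0
            self.counter = 0

dga = AdaptiveDGA(seed=99)
for _ in range(4):
    print(dga.next_domain())
    dga.report_blocked()  # pretend every domain gets blocked
```

After the third blocked report, the generator silently switches to a new seed, so the fourth domain comes from an entirely different sequence.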
1.3.2 AI-Assisted DGAs: A New Era of
Intelligence
The integration of artificial intelligence (AI) and
machine learning (ML) into DGAs represents the cutting
edge of domain generation technology. By leveraging
AI, modern DGAs can now learn from previous domains,
analyze network traffic, and mimic benign domains,
making them incredibly difficult to detect. Traditional
methods that were once effective against simple DGA
algorithms are increasingly less reliable when facing
AI-assisted DGA techniques.
One of the primary advancements AI brings to
DGAs is the ability to generate high-entropy domain
names. Early DGAs often produced domains with
low randomness, making them easier to detect and
block. However, AI-powered algorithms can generate
complex, random, and varied domain names that are
much harder to differentiate from legitimate traffic. By
training on a vast dataset of domains, AI systems can
generate names that follow patterns similar to those
used by legitimate websites, often making it difficult
for traditional detection systems to identify malicious
domains.
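As a minimal stand-in for the learned models described here, the toy character-level Markov chain below is trained on a handful of benign-looking names (a tiny corpus of our own invention) and emits labels whose character statistics resemble the training set; production AI-assisted DGAs would use far larger corpora and neural generators:

```python
import random
from collections import defaultdict

# Tiny illustrative corpus of benign-looking names (invented here).
CORPUS = ["weatherupdate", "cloudservice", "newsportal",
          "shoponline", "travelguide"]

def train(corpus):
    """Build a first-order character-transition table."""
    table = defaultdict(list)
    for word in corpus:
        for a, b in zip(word, word[1:]):
            table[a].append(b)
    return table

def generate(table, length=10):
    """Random-walk the transition table to emit a plausible label."""
    ch = random.choice(list(table))
    out = [ch]
    for _ in range(length - 1):
        successors = table.get(out[-1])
        if not successors:
            break
        out.append(random.choice(successors))
    return "".join(out) + ".com"

table = train(CORPUS)
print([generate(table) for _ in range(3)])
```

Because the output mimics the letter patterns of legitimate names rather than emitting uniformly random strings, simple entropy-based detectors are far less effective against it.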
Furthermore, AI-assisted DGAs can adapt to real-
time network feedback. If a particular domain or
set of domains is blocked by security systems, the
AI model can adjust its domain generation process
accordingly. These algorithms can even adapt based
on geo-location data, ensuring that the domains remain
effective regardless of where the malware is being
executed.
1.4 Conclusion
Domain Generation Algorithms (DGAs) have
undergone significant evolution, from simple,
predictable pattern-based models in the early days to
the complex, dynamic systems we see today. Initially,
DGAs were based on static time or seed values,
generating domains in a predictable manner that could
be easily identified and blocked. However, with
the rise of more sophisticated time-based and seed-
driven DGAs, attackers introduced variability and
adaptability into their domain generation strategies.
These algorithms allowed for more diverse domain
names, making detection more challenging but still not
insurmountable for traditional defense mechanisms.
In the modern era, DGAs have become increasingly
advanced with the integration of adaptive techniques and
artificial intelligence (AI). AI-driven DGAs are capable
of learning from past interactions, adjusting their
domain generation patterns in real-time, and creating
highly random, legitimate-looking domains that are
difficult to distinguish from legitimate traffic. These
innovations make it increasingly difficult for defenders
to block or preemptively detect malicious domains.
As DGAs continue to evolve, cybersecurity defenses
must also adapt, relying on advanced machine learning,
behavioral analysis, and collaborative intelligence to
combat these ever-evolving threats. The progression of
DGAs marks a critical challenge in the ongoing arms
race between attackers and defenders in cybersecurity.
About the Author
Jinsu Ann Mathew is a research scholar
in Natural Language Processing and Chemical
Informatics. Her interests include applying basic
scientific research on computational linguistics,
practical applications of human language technology,
and interdisciplinary work in computational physics.
About airis4D
Artificial Intelligence Research and Intelligent Systems (airis4D) is an AI and Bio-sciences Research Centre.
The Centre aims to create new knowledge in the field of Space Science, Astronomy, Robotics, Agri Science,
Industry, and Biodiversity to bring Progress and Plenitude to the People and the Planet.
Vision
Humanity is in the 4th Industrial Revolution era, which operates on a cyber-physical production system. Cutting-
edge research and development in science and technology to create new knowledge and skills become the key to
the new world economy. Most of the resources for this goal can be harnessed by integrating biological systems
with intelligent computing systems offered by AI. The future survival of humans, animals, and the ecosystem
depends on how efficiently the realities and resources are responsibly used for abundance and wellness. Artificial Intelligence Research and Intelligent Systems pursues this vision and looks for the best actions that ensure an abundant environment and ecosystem for the planet and the people.
Mission Statement
The 4D in airis4D represents the mission to Dream, Design, Develop, and Deploy Knowledge with the fire of
commitment and dedication towards humanity and the ecosystem.
Dream
To promote the unlimited human potential to dream the impossible.
Design
To nurture the human capacity to articulate a dream and logically realise it.
Develop
To assist talents in materialising a design into a product, a service, or knowledge that benefits the community and the planet.
Deploy
To realise and educate humanity that a knowledge that is not deployed makes no difference by its absence.
Campus
Situated in a lush green village campus in Thelliyoor, Kerala, India, airis4D was established under the auspices of the SEED Foundation (Susthiratha, Environment, Education Development Foundation), a not-for-profit company for promoting Education, Research, Engineering, Biology, Development, etc.
The whole campus is powered by solar energy and has a rain-harvesting facility that can provide a sufficient water supply for up to three months of drought. The computing facility on the campus is accessible from anywhere, 24×7, through dedicated optical-fibre internet connectivity.
There is a freshwater stream that originates from the nearby hills and flows through the middle of the campus.
The campus is a noted habitat for the biodiversity of tropical fauna and flora. airis4D carries out periodic and systematic water-quality and species-diversity surveys in the region to ensure its richness. It is our pride that the site has consistently been environment-friendly and rich in biodiversity. airis4D also grows fruit plants that can feed birds and maintains water bodies to help them survive the drought.