Cover page
Image Name: Cyber Security: Crime and cheating on the web are increasing. One should be prepared.
Image Source: AI-generated image by Jinsu Ann Mathew
Managing Editor: Ninan Sajeeth Philip
Chief Editor: Abraham Mulamoottil
Editorial Board: K Babu Joseph, Ajit K Kembhavi, Geetha Paul, Arun Kumar Aniyan, Sindhu G
Correspondence: The Chief Editor, airis4D, Thelliyoor - 689544, India
Journal Publisher Details
Publisher : airis4D, Thelliyoor 689544, India
Website : www.airis4d.com
Email : nsp@airis4d.com
Phone : +919497552476
Editorial
by Fr Dr Abraham Mulamoottil
airis4D, Vol.3, No.2, 2025
www.airis4d.com
This issue of airis4D highlights the release of
DeepSeek in January 2025. Its open-source approach
and groundbreaking advancements set new benchmarks
for AI innovation. Unlike many proprietary models,
DeepSeek champions transparency, collaboration, and
cost-effective performance. It rivals top AI models
at a fraction of the cost, making advanced AI
more accessible and energy-efficient. It reduces
operational costs and environmental impact while
achieving significant improvements in inference speed
and enhancing user experience.
Blesson George's article on “Multimodal
Transformers” describes how transformer models extend
beyond text to process multiple data types—such as
images, audio, and video—enabling a more human-
like understanding of information. These models
use self-attention and cross-modal fusion to integrate
diverse inputs effectively. Key applications include
medical diagnosis, behaviour analysis, vision-language
tasks, and recommendation systems. Different fusion
techniques, such as early fusion, late fusion, and
cross-attention fusion, optimize how data is combined
and interpreted. Multimodal representation learning
remains a challenge, requiring smooth, structured,
and robust embeddings. Despite these open challenges,
multimodal transformers are revolutionizing AI by
making it more versatile, interpretable, and adaptable
across industries.
The article “X-ray Astronomy: Through Missions”
by Aromal P explores key X-ray astronomy
missions from the 1990s that significantly advanced
our understanding of high-energy astrophysical
phenomena. It highlights NASA's Rossi X-ray Timing
Explorer (RXTE), launched in 1995, which provided
groundbreaking insights into neutron stars, black holes,
and gamma-ray bursts over its 16-year operational
period. The article also discusses India's IRS-P3
satellite, which, despite being a remote sensing satellite,
carried the Indian X-ray Astronomy Experiment
(IXAE), contributing valuable data on X-ray binaries.
Another major mission covered is Italy’s BeppoSAX,
launched in 1996, which played a pivotal role in the
study of gamma-ray burst afterglows and high-energy
astrophysical sources. The article emphasizes how data
from these missions continue to shape X-ray astronomy
research even decades after their decommissioning.
“T Corona Borealis (T CrB)” by Sindhu G describes a
fascinating binary star system located about 1,300
light-years away in the Corona Borealis constellation.
Comprising a white dwarf and a red giant, it is
classified as a cataclysmic variable due to its periodic
outbursts caused by mass transfer between the two stars.
First documented in 1865, T CrB undergoes dramatic
increases in brightness as the white dwarf accretes
material from its red giant companion, forming an
accretion disk that periodically ignites. With an orbital
period of approximately 31.6 days, this system provides
key insights into stellar evolution, binary interactions,
and mass transfer processes.
Observations across multiple wavelengths, including
optical, infrared, and X-ray, help astronomers track these
outbursts and improve predictive models. Scientists
are currently monitoring T CrB in anticipation of its
next nova explosion, expected around 2025. This study
highlights the system’s significance in understanding the
lifecycle of stars, with ongoing research offering deeper
insights into the dynamics of cataclysmic variables.
The article “Synthetic Biology: A Revolutionary
Scientific Frontier” by Geetha Paul highlights that
synthetic biology is a transformative field that applies
engineering principles to biological systems, treating
them as programmable entities. Unlike traditional
genetic engineering, it seeks to create entirely new
biological components and organisms with precise
control. Researchers use advanced tools like CRISPR,
DNA sequencing, and AI to manipulate genetic
material, leading to groundbreaking applications in
medicine, environmental sustainability, and industry.
Key innovations include engineered cells for targeted
therapies, microorganisms that degrade plastic waste,
and synthetic biological circuits. Ethical considerations
and regulatory frameworks are crucial to ensure
responsible development. As technology advances,
synthetic biology holds immense potential to address
global challenges in healthcare, conservation, and
industrial production.
The article “Domain Generation Algorithm (DGA)
Detection Techniques” by Jinsu Ann Mathew explores
modern techniques for detecting domains generated
by Domain Generation Algorithms (DGAs), which
cybercriminals use to evade traditional security
measures. As DGAs become more advanced, detection
strategies have evolved to counter them. Key detection
methods include statistical analysis, which identifies
anomalies in character distribution, domain length, and
entropy to flag suspicious domains; machine learning
(ML), which uses supervised, unsupervised, and deep
learning models to classify domains based on learned
patterns; n-gram analysis, which examines character
sequences to differentiate human-created domains from
DGA-generated ones; and DNS query pattern monitoring,
which tracks query frequencies, time-based behaviours,
and newly observed domains to detect malicious activity.
By integrating these techniques, security systems can
build a robust, adaptive defence against evolving DGA
threats. The article highlights the importance of
interdisciplinary research in strengthening cybersecurity
measures.
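The entropy-based statistical check summarized above is easy to sketch. The snippet below is a toy illustration, not code from the article: the 3.5-bit threshold is an assumed value, not a tuned one. It computes the Shannon entropy of a domain's second-level label and flags unusually random-looking names:

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_dga(domain: str, threshold: float = 3.5) -> bool:
    """Flag a domain whose second-level label has unusually high entropy.

    The threshold is illustrative; real systems tune it on labeled data
    and combine entropy with length, n-gram, and DNS-query features.
    """
    label = domain.split(".")[0]
    return shannon_entropy(label) > threshold

print(shannon_entropy("google"))        # human-chosen names repeat letters
print(shannon_entropy("xjw3kq9zpt4m"))  # generated names look near-random
print(looks_dga("xjw3kq9zpt4m.com"))
```

On its own, entropy is a weak signal; as the article stresses, it works best combined with the other detection methods listed above.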
News Desk
Marry Gormally from the USA visited airis4D on January 29, 2025.
Contents
Editorial ii
I Artificial Intelligence and Machine Learning 1
1 Multimodal Transformers 2
1.1 Types of Modalities in Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Different Fusion Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
II Astronomy and Astrophysics 5
1 X-ray Astronomy: Through Missions 6
1.1 Satellites in 1990s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 T Corona Borealis: A Comprehensive Study of a Binary Star System 10
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Discovery and Historical Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 The Components of T Corona Borealis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Cataclysmic Variability and Outbursts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.5 Orbital Characteristics and Periodicity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.6 The Role of T Corona Borealis in Stellar Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.7 Observations and Research on T Corona Borealis . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.8 Future Prospects and Challenges in Studying T Corona Borealis . . . . . . . . . . . . . . . . . . 12
2.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
III Biosciences 14
1 Synthetic Biology: A Revolutionary Scientific Frontier 15
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2 Embedded Hybridisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3 The Fundamental Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4 Technological and Healthcare Innovations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5 GES (General Environmental Solutions) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.6 Environmental and Industrial Solutions (EIS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7 Ethical Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.8 Future Potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
IV General 20
1 Domain Generation Algorithm (DGA) Detection Techniques 21
1.1 Statistical Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.2 Machine Learning (ML) Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.3 N-gram Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.4 DNS Query Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Part I
Artificial Intelligence and Machine Learning
Multimodal Transformers
by Blesson George
airis4D, Vol.3, No.2, 2025
www.airis4d.com
In the rapidly evolving field of artificial
intelligence, transformers have revolutionized the way
machines process and understand data. Originally
designed for natural language processing, transformers
have now been extended beyond text-based tasks to
incorporate multiple modalities—such as images, audio,
and video—giving rise to multimodal transformers.
These advanced AI models are capable of processing
and integrating information from different data sources
to perform more complex and human-like understanding
of the world.
1.0.1 What Are Multimodal Transformers?
Multimodal transformers are a class of AI models
that can simultaneously process and relate multiple
types of input data. Unlike traditional unimodal models
that operate on a single type of data (e.g., text-only or
image-only models), multimodal transformers leverage
self-attention mechanisms to combine textual, visual,
and auditory information effectively. This capability
allows them to generate richer, more context-aware
predictions and responses.
1.0.2 How Do They Work?
At their core, multimodal transformers use a
shared embedding space where different data types
are transformed into a common representation. These
models employ cross-modal attention mechanisms to
allow interactions between various inputs, enhancing
the model’s ability to understand relationships between
different modalities. For example, a multimodal
transformer that analyzes an image with a caption will
align relevant words with the corresponding visual
elements to provide a coherent interpretation.
1.1 Types of Modalities in Machine
Learning
Multimodal learning involves various types of
data modalities, which can be categorized into four
main groups. Tabular data is represented as structured
datasets where observations are rows and features are
columns. Graphs are datasets where observations
are represented as vertices, with edges capturing
relationships between them. Signals include data in file
formats such as images (.jpeg), audio (.wav), and other
structured numerical representations. Sequences are
structured data such as text, where features correspond
to different elements in the sequence, including
characters, words, or documents.
In real-world applications, multimodal models
incorporate various data sources. Labels are an example
of tabular data, while reviews, titles, and descriptions
represent textual data. Images of products and movie
posters are examples of visual data, and relations
between movies, such as those seen by one user, are
represented as graphs. These diverse data sources
allow multimodal models to gain a comprehensive
understanding of information.
1.1.1 Examples of Multimodal Tasks
Multimodal learning has been successfully applied
in various domains. In medical diagnosis, multimodal
AI integrates medical images, patient demographics,
and genetic data to predict diseases at an early
stage. Behavior analysis utilizes audio, video, and
physiological signals to assess human emotions and
stress levels. Vision-language tasks involve models
that perform question answering based on textual and
image data, such as identifying objects in an image
or explaining their significance. Recommendation
systems enhance user experiences by combining text
descriptions, images, and social graphs to suggest
personalized content, such as movies or fashion
recommendations.
1.1.2 Multimodal Representation Learning
Effective multimodal representation learning
involves capturing meaningful embeddings from
multiple modalities. While unimodal models like BERT
(text) and ResNet (images) have been successful in their
respective domains, a universal approach for integrating
multimodal data is still being developed. A good
multimodal representation should ensure smoothness,
meaning that similar inputs should produce similar
embeddings. It should maintain manifold structures,
where data points belonging to similar categories cluster
together. Natural clustering should allow categorical
labels to be assigned based on common features, and
sparsity should ensure that representations focus on the
most relevant features while avoiding redundancy.
Additionally, effective multimodal representation
should preserve similarities between different
modalities in a joint embedding. It should also be robust
to missing modalities, allowing meaningful inference
even when some data sources are unavailable.
1.2 Different Fusion Techniques
Multimodal transformers use different fusion
techniques to integrate multiple modalities effectively:
1.2.1 Early Fusion (Joint Embedding Space)
In this approach, different modalities (e.g., text and
image) are combined at the input stage and mapped into
a shared embedding space. This allows the model to
learn joint representations early in the process, capturing
relationships between modalities from the beginning.
This method is efficient in ensuring tight integration
between different types of input but requires large,
well-annotated datasets to work effectively.
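The idea can be sketched in a few lines; the feature vectors and projection weights below are illustrative stand-ins, not a trained model. The modality features are concatenated at the input, and a single projection then operates on the joint vector:

```python
# Early fusion: concatenate modality features at the input stage, then
# map the joint vector into a shared embedding with one projection.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in W]

text_feat = [0.2, 0.7]          # stand-in text features
image_feat = [0.5, -0.3, 0.1]   # stand-in image features

joint = text_feat + image_feat  # fused at the input: length 5

# Hypothetical joint projection (5 inputs -> 2-d shared embedding).
W_joint = [
    [0.1, 0.2, 0.3, -0.1, 0.4],
    [0.5, -0.2, 0.1, 0.2, 0.0],
]
joint_emb = matvec(W_joint, joint)
print(joint_emb)  # one representation learned over all modalities at once
```

Because every weight sees every modality from the start, cross-modal relationships can be captured early, which is exactly what makes this approach data-hungry.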
Figure 1: Late fusion architecture
1.2.2 Late Fusion (Separate Encodings)
Here, each modality is processed separately
through independent encoders, and their representations
are combined only at the final decision-making stage.
This method is useful when individual modalities
need to be analyzed independently before integrating
their outputs. Late fusion is often more interpretable
than early fusion since decisions can be traced
back to specific modalities. It is widely used in
applications where each modality contributes distinct
and complementary information, such as combining
audio and text in sentiment analysis.
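A minimal sketch of late fusion, assuming hypothetical per-modality sentiment scores: each modality is classified independently, and only the final probability distributions are combined.

```python
# Late fusion: each modality is scored by its own model; only the final
# scores are combined (here, a weighted average of class probabilities).

def weighted_average(dists, weights):
    """Combine per-modality probability distributions with given weights."""
    total = sum(weights)
    n = len(dists[0])
    return [sum(w * d[i] for d, w in zip(dists, weights)) / total
            for i in range(n)]

# Hypothetical per-modality sentiment probabilities [negative, positive].
audio_probs = [0.30, 0.70]  # audio model: sounds upbeat
text_probs = [0.60, 0.40]   # text model: wording is ambivalent

fused = weighted_average([audio_probs, text_probs], weights=[1.0, 1.0])
print(fused)  # [0.45, 0.55] -> final decision traces back to both modalities
```

The interpretability benefit mentioned above is visible here: the contribution of each modality to the fused score can be read off directly.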
1.2.3 Cross-Attention Fusion
This technique enables direct interactions between
different modalities by using cross-attention layers,
where features from one modality influence the
representation of another. Instead of simply
concatenating features, cross-attention allows the model
to dynamically adjust its understanding of one modality
based on the other. This is particularly useful in
tasks like image captioning and video analysis, where
the model must understand which words align with
which visual components. Vision-language models such
as ViLBERT and Flamingo use cross-attention
mechanisms to create powerful joint representations.
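The mechanism can be sketched with plain scaled dot-product attention; the vectors below are toy values with no learned weights. Each query from one modality (here, word vectors) produces a weighted mix of the other modality's values (here, image-region vectors):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Each query (one modality) attends over keys/values (another modality)."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

# Toy example: two word vectors attending over three image-region vectors.
word_vecs = [[1.0, 0.0], [0.0, 1.0]]
region_keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
region_vals = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

attended = cross_attention(word_vecs, region_keys, region_vals)
print(attended)  # each word is now a mix of the regions it aligns with
```

Each word's output is dominated by the region whose key it matches best, which is the alignment behaviour that image captioning relies on.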
1.3 Conclusion
Multimodal transformers represent a
groundbreaking advancement in artificial intelligence,
pushing the boundaries of machine understanding by
integrating multiple data modalities. By combining
text, images, audio, and other forms of information,
these models offer a more comprehensive and
accurate interpretation of complex data. Their
applications across various fields, including healthcare,
human-computer interaction, and content generation,
demonstrate their transformative potential. As research
continues, the development of more efficient and
interpretable multimodal systems will further enhance
AI’s capabilities, making it more responsive and
adaptable to real-world challenges. The future of AI
lies in harnessing the power of multimodal transformers
to create smarter, more versatile, and human-like
intelligent systems.
References
1. Multimodal Models and Fusion - A Complete Guide
2. Effective Techniques for Multimodal Data Fusion: A Comparative Analysis
About the Author
Dr. Blesson George presently serves as
an Assistant Professor of Physics at CMS College
Kottayam, Kerala. His research pursuits encompass
the development of machine learning algorithms, along
with the utilization of machine learning techniques
across diverse domains.
Part II
Astronomy and Astrophysics
X-ray Astronomy: Through Missions
by Aromal P
airis4D, Vol.3, No.2, 2025
www.airis4d.com
1.1 Satellites in 1990s
So far, we have discussed the X-ray astronomy
satellites launched before 1995. In the previous article,
we noted that the golden age of X-ray astronomy
began in the 1990s. Here, we discuss the satellites
that contributed extensively to our knowledge of
X-ray-emitting sources.
1.1.1 RXTE
The Rossi X-ray Timing Explorer (RXTE),
developed, launched, and operated by NASA, was one of
the most significant leaps forward in X-ray astronomy.
It had the best timing resolution of any X-ray
astronomical satellite launched until then, and its timing
capabilities match those of any other instrument
launched in that decade. The satellite was launched
from Kennedy Space Center on December 30, 1995,
on a Delta II rocket. It was placed in a low-Earth
circular orbit at an altitude of 580 km and an inclination
of 23°, with an orbital period of 90 minutes. Initially
planned for 2 years, RXTE stayed operational for 16
years, until January 5, 2012, providing data that changed
our views of neutron stars and black holes. The satellite
covered an energy range of 2-250 keV.
Three scientific payloads were on board RXTE:
Proportional Counter Array (PCA): The PCA
consisted of five identical xenon-based
proportional counters, each with an effective
area of 1300 cm² operating in the 2-60 keV
energy range. It had a total effective area of
6500 cm² and observed the sky with a FOV of 1°
and a timing resolution of 1 µs. The PCAs were
designed, built, and tested at the Goddard
Space Flight Center (GSFC) Laboratory for High
Energy Astrophysics.
Figure 1: Cut-away view of XTE Satellite.
Credit: C.A. Glasser et al. 1994
High Energy X-ray Timing Experiment
(HEXTE): HEXTE consisted of eight NaI/CsI
scintillation detectors, each optically coupled to a
photomultiplier tube. A lead honeycomb
collimated the detectors, which were
independently grouped into two clusters, each
with its own data processing system. With
a net effective area of 1600 cm², the detectors
worked in the 20-200 keV energy range with a
timing resolution of 8 µs. This instrument
was developed by the University of California at
San Diego.
All-Sky Monitor (ASM): The ASM consisted of three
position-sensitive proportional cameras mounted
on a rotating pedestal. While one pointed along
the axis of rotation of RXTE, the other two rotated
perpendicular to it. The total effective area of the
ASM was 90 cm², and it worked in the 2-10 keV
energy range. It scanned 80% of the sky every 90
minutes and had a FOV of 6° × 90°.
RXTE was an extremely successful mission that
discovered many interesting phenomena related to
compact objects, including kilohertz Quasi-Periodic
Oscillations (kHz QPOs) and burst oscillations in
neutron star systems, as well as high-frequency
oscillations in black hole systems. It also
discovered the first accreting millisecond pulsar, which
proved how radio-emitting millisecond pulsars are
formed; later, it discovered many more accreting
millisecond pulsars. It was earlier believed that
accreting millisecond pulsars did not show any burst
phenomena because of magnetic field confinement.
However, RXTE observed burst phenomena from
an accreting millisecond X-ray pulsar for the first time.
RXTE also detected the afterglow of a gamma-ray burst.
More than a decade after its decommissioning, RXTE
data continue to be analyzed and to yield new science.
1.1.2 IRS-P3
The Indian Remote Sensing Satellite-P3 (IRS-P3)
was launched on March 21, 1996, by the Polar Satellite
Launch Vehicle (PSLV) into an 817 km circular orbit at
an inclination of 98.7°. Even though it was a
remote sensing satellite, it carried an experimental
X-ray astronomy instrument called the Indian X-ray
Astronomy Experiment (IXAE). The IXAE payload
consisted of three identical pointed proportional
counters (PPCs) and one X-ray Sky Monitor (XSM).
The PPCs worked in the 2-20 keV energy range, and the
XSM in the 2-8 keV range. During its operation, IXAE
provided valuable information about X-ray binaries.
1.1.3 BeppoSAX
"Satellite per Astronomia X" (SAX, Italian for
"Satellite for X-ray Astronomy") was an Italian satellite
developed to study high-energy phenomena. After
launch, the satellite was renamed BeppoSAX in honor
of the Italian physicist Giuseppe Paolo Stanislao
"Beppo" Occhialini. The mission was supported by a
consortium of Italian institutes, institutes in the
Netherlands, and ESA's space science department.
BeppoSAX was launched on April 30, 1996, into a
600 km circular orbit by an Atlas-Centaur rocket. With
an inclination of 3.9° and an orbital period of 96
minutes, the orbit provided a safe escape from the South
Atlantic Anomaly, using the screening of Earth's
magnetic field to reduce the cosmic-ray background.
The satellite operated for 6 years and was
decommissioned in April 2002. With its different
instruments, BeppoSAX studied high-energy
phenomena across an energy range of 0.1-300 keV.
There were 5 scientific instruments on BeppoSAX:
Figure 2: BeppoSAX scientific payload
accommodation.
Credit: G. Boella et al. 1997
The Low Energy Concentrator Spectrometer
(LECS): An imaging gas scintillation
proportional counter sensitive to X-rays
in the 0.1-10 keV energy range. It had an
effective area of 124 cm² and a time resolution
of 16 µs.
The Medium Energy Concentrator Spectrometers
(MECS): Each consisted of a grazing-incidence
Mirror Unit and a position-sensitive Gas
Scintillation Proportional Counter. BeppoSAX
had three identical MECS units, which worked
in the 1.3-10 keV energy range, each unit
having an effective area of 124 cm².
High-Pressure Gas Scintillation Proportional
Counter (HPGSPC): A xenon-based proportional
counter (90% xenon, 10% helium) with an
effective area of 450 cm². It operated in the
3-120 keV energy range.
Phoswich Detector System (PDS): It consisted
of a square array of four independent
NaI(Tl)/CsI(Na) PHOsphor sandWICH
(PHOSWICH) scintillation detectors. Each of
the four detectors was made of two optically
coupled crystals of NaI(Tl) and CsI(Na),
forming what is known as a phoswich. The PDS
had an effective area of 640 cm² and worked in
the 15-300 keV energy range.
Wide Field Camera (WFC): A coded-mask
aperture camera developed by the Space Research
Organization Netherlands (SRON). BeppoSAX
had two WFCs on board, which worked in the
2-30 keV energy range, suitable for imaging
high-energy X-rays. It had an effective area of
600 cm².
Among these instruments, the LECS, MECS, HPGSPC,
and PDS were narrow-field instruments. One of
the main features of BeppoSAX was its broadband
coverage, from 0.1 keV to 300 keV. BeppoSAX
studied numerous galactic and extragalactic sources and
discovered new sources as well. Along with RXTE, it
studied the afterglow of a gamma-ray burst for the first
time.
References
Santangelo, A., Madonia, R., and Piraino, S., "A Chronological History of X-Ray Astronomy Missions", Handbook of X-ray and Gamma-ray Astrophysics, ISBN 9789811645440.
RXTE Image Gallery
Proportional Counter Array (PCA)
Jahoda, K., Markwardt, C. B., Radeva, Y., Rots, A. H., Stark, M. J., Swank, J. H., Strohmayer, T. E., and Zhang, W., "Calibration of the Rossi X-ray Timing Explorer Proportional Counter Array".
Glasser, C. A., Odell, C. E., and Seufert, S. E., "The Proportional Counter Array (PCA) Instrument for the X-ray Timing Explorer Satellite (XTE)", IEEE Transactions on Nuclear Science, Vol. 41, No. 4, August 1994.
Rothschild, R. E., Blanco, P. R., Gruber, D. E., Heindl, W. A., MacDonald, D. R., Marsden, D. C., Pelling, M. R., Wayne, L. R., and Hink, P. L., "In-Flight Performance of the High Energy X-Ray Timing Experiment on the Rossi X-Ray Timing Explorer", ApJ 496, 538.
Parmar, A. N., Martin, D. D. E., Bavdaz, M., Favata, F., Kuulkers, E., Vacanti, G., Lammers, U., Peacock, A., and Taylor, B. G., "The low-energy concentrator spectrometer on-board the BeppoSAX X-ray astronomy satellite", A&A Supplement Series, Vol. 122, April II 1997, p. 309-326.
Boella, G., Chiappetti, L., Conti, G., Cusumano, G., Del Sordo, S., La Rosa, G., Maccarone, M. C., Mineo, T., Molendi, S., Re, S., Sacco, B., and Tripiciano, M., "The medium-energy concentrator spectrometer on board the BeppoSAX X-ray astronomy satellite", A&A Supplement Series, Vol. 122, April II 1997, p. 327-340.
Instrument description
The BeppoSAX Phoswich Detector System (PDS)
Jager, R., Mels, W. A., Brinkman, A. C., Galama, M. Y., Goulooze, H., Heise, J., Lowes, P., Muller, J. M., Naber, A., Rook, A., Schuurhof, R., Schuurmans, J. J., and Wiersma, G., "The Wide Field Cameras onboard the BeppoSAX X-ray Astronomy Satellite", A&A Supplement Series, Vol. 125, November I 1997, p. 557-572.
Boella, G., Butler, R. C., Perola, G. C., Piro, L., Scarsi, L., and Bleeker, J. A. M., "BeppoSAX, the wide band mission for X-ray astronomy", A&A Supplement Series, Vol. 122, April II 1997, p. 299-307.
Indian X-ray Astronomy Experiment (IXAE)
Indian Remote Sensing Satellite IRS-P3
About the Author
Aromal P is a research scholar in the
Department of Astronomy, Astrophysics and Space
Engineering (DAASE) at the Indian Institute of
Technology Indore. His research mainly focuses on
thermonuclear X-ray bursts on neutron star surfaces
and their interaction with the accretion disk and
corona.
T Corona Borealis: A Comprehensive Study
of a Binary Star System
by Sindhu G
airis4D, Vol.3, No.2, 2025
www.airis4d.com
Figure 1: Corona Borealis. (Image Credit: seasky.org)
2.1 Introduction
T Corona Borealis (T CrB) is one of the most
intriguing binary star systems in the sky. Situated
approximately 1,300 light-years from Earth in the
constellation of Corona Borealis (Figures 1 and
2), T CrB (Figure 3) has attracted considerable attention
in the astronomical community. Its combination of
different stellar types, especially its character as a
cataclysmic variable star, makes it a fascinating subject
for amateur and professional astronomers alike.
In this article, we will explore the star system’s
discovery, its key features, the physical characteristics
of its components, its behavior over time, and its
significance to the field of stellar evolution and
astronomical research.
Figure 2: IAU Corona Borealis chart. (Image Credit:
IAU and Sky and Telescope magazine (Roger Sinnott
and Rick Fienberg))
Figure 3: Chart showing location of T Coronae Borealis
in constellation of Corona Borealis. (Image Credit:
DE/The Guardian)
2.2 Discovery and Historical
Background
The star system T Corona Borealis was first
cataloged in 1865, when astronomers discovered its
unusual variability. The system's designation, "T CrB,"
comes from its placement in the constellation Corona
Borealis, the Northern Crown. It is a binary system
composed of a white dwarf and a red giant star, and
it has been extensively studied for its variability and
unique features.
The most remarkable characteristic of T CrB is its
classification as a cataclysmic variable. This means that
it experiences periodic outbursts, which are associated
with significant increases in luminosity. These outbursts,
though not as dramatic as those in some other variable
stars, are indicative of the complex interaction between
the two stars in the binary system.
2.3 The Components of T Corona
Borealis
2.3.1 The White Dwarf
At the heart of the T Corona Borealis system lies
a white dwarf star. White dwarfs are the remnants of
stars that were once similar in size to our Sun but have
exhausted their nuclear fuel and shed their outer layers.
What remains is an incredibly dense core, which is the
white dwarf.
The white dwarf in T CrB is primarily composed
of carbon and oxygen, with a mass roughly 1.2 times
that of the Sun, but with a volume similar to Earth. Its
surface temperature is high, ranging from 20,000 to
30,000 K, which makes it very hot compared to cooler
stars, such as the red giant companion.
2.3.2 The Red Giant Companion
The companion star to the white dwarf in T CrB
is a red giant. Red giants are stars that are in a later
phase of their evolution, having expanded and cooled
after exhausting their core hydrogen supply. The red
giant in T CrB is roughly 3 times the mass of the Sun
Figure 4: Artist's impression of a white dwarf drawing
material away from its red giant partner. (Image Credit:
NASA/CXC/M.Weiss)
and has a much larger radius, which extends far beyond
the size of the white dwarf.
The interaction between these two stars—one a
degenerate, dense remnant and the other a bloated,
expanding giant—forms the basis of the cataclysmic
variable behavior. The red giant transfers material via
its outer layers to the white dwarf, which accretes matter
and exhibits dramatic increases in brightness during
certain intervals (Figure 4).
2.4 Cataclysmic Variability and
Outbursts
The term cataclysmic variable refers to stars that
experience dramatic fluctuations in brightness over
time. In the case of T Corona Borealis, this variability
is driven by the mass transfer between the two stars.
The red giant is losing material to the white dwarf via
an accretion disk, and this process produces periodic
bursts of energy.
The mass transfer from the red giant creates a disk
of hot, ionized gas around the white dwarf. As this gas
spirals inward, it heats up and emits radiation, causing
the system’s brightness to fluctuate. These outbursts
typically occur on a timescale of decades, but the system
remains in a constant state of mass exchange.
2.5 Orbital Characteristics and
Periodicity
T Corona Borealis has an orbital period of about
31.6 days, meaning that the two stars in the system
orbit each other roughly once a month. The separation
between the two stars is around 0.15 AU (astronomical
units), which is quite close in terms of stellar systems.
Due to the proximity of the stars, gravitational
interactions play a significant role in shaping the
system’s behavior. The white dwarf pulls material
from the outer layers of the red giant, and this mass
transfer can cause temporary changes in the system’s
luminosity, as mentioned earlier. The orbital period
is important for astronomers because it helps them
determine the system’s mass and size, which provides
insight into the properties of both stars.
2.6 The Role of T Corona Borealis in
Stellar Evolution
The study of T Corona Borealis provides valuable
insights into the evolution of binary star systems. As
a cataclysmic variable, it is a key example of how
mass transfer between binary stars can influence their
development. In particular, the white dwarf in this
system is a perfect example of a star that has undergone
significant mass loss and transformation after exhausting
its fuel.
Understanding such systems also aids astronomers
in studying the fate of stars similar to the Sun. As stars
like the Sun age, they will eventually become red giants,
shedding material onto companion stars. In some cases,
this will lead to the formation of white dwarfs and other
exotic remnants.
2.7 Observations and Research on T
Corona Borealis
Astronomers use a variety of techniques to study
T Corona Borealis, from ground-based telescopes to
space-based observatories. Observations at different
wavelengths—including optical, infrared, and X-
ray—have provided insights into the complex dynamics
of the system.
For example, in X-rays, the accretion disk around
the white dwarf becomes a powerful emitter of high-
energy radiation. These emissions are often studied
using space-based telescopes like the Chandra X-ray
Observatory and the XMM-Newton satellite.
In addition, optical observations have been used
to track the periodic outbursts and better understand
the mass transfer process. Through careful monitoring,
astronomers can measure the light curves and predict
the timing of the next outbursts, which are crucial for
advancing our understanding of cataclysmic variables.
2.8 Future Prospects and Challenges
in Studying T Corona Borealis
While much has been learned about T Corona
Borealis, there is still much to discover. The complexity
of the mass transfer and accretion processes is not fully
understood, and ongoing observations will continue to
refine our understanding of the system.
One challenge is the irregularity of outbursts,
which makes long-term predictions difficult. Despite
this, advancements in technology, particularly with the
next generation of space telescopes, hold great promise
for unraveling the mysteries of cataclysmic variable
systems like T CrB.
2.9 Conclusion
T Corona Borealis is an exceptional example
of a cataclysmic variable binary star system, and
its study continues to provide valuable insights into
stellar evolution, binary interactions, and the complex
dynamics of mass transfer. Through the combination of
theoretical models and observational data, astronomers
can continue to unlock the mysteries of such systems,
enhancing our understanding of the life cycle of stars
and the nature of the universe itself.
In the next article, we will explore the anticipated
nova explosion of T Coronae Borealis (T CrB), which
scientists believe could become visible around 2025.
However, the exact timing remains uncertain, and
researchers are eagerly monitoring this event.
References:
Corona Borealis
T Coronae Borealis
Rare T Coronae Borealis nova explosion this
September: What it is and how to see it
NASA, Global Astronomers Await Rare Nova
Explosion
T Coronae Borealis Nova Explosion 2024: Has
T CrB Nova Happened Yet?
Corona Borealis (The Northern Crown)
Constellation
Corona Borealis Constellation
Corona Borealis
A Blaze Star or ‘Nova’ in Corona Borealis
A Guide to the Corona Borealis Constellation
and Its Stars
Corona Borealis
About the Author
Sindhu G is a research scholar in Physics
doing research in Astronomy & Astrophysics. Her
research mainly focuses on classification of variable
stars using different machine learning algorithms. She
also works on period prediction for different types of
variable stars, especially eclipsing binaries, and on
the study of optical counterparts of X-ray binaries.
Part III
Biosciences
Synthetic Biology: A Revolutionary Scientific
Frontier
by Geetha Paul
airis4D, Vol.3, No.2, 2025
www.airis4d.com
1.1 Introduction
Synthetic biology represents a transformative
scientific discipline that fundamentally challenges our
traditional understanding of biological systems. Unlike
conventional biological research, this innovative field
applies sophisticated engineering principles to living
organisms, treating them as complex, programmable
systems that can be systematically designed, modified,
and reconstructed. Synthetic biology seeks to transcend
the limitations of existing biological frameworks
by creating entirely new biological components and
systems that do not naturally occur in the world.
Researchers in this field view genetic material not
merely as a biological blueprint but as a programmable
language that can be manipulated with precision and
creativity. This approach represents a profound shift
from traditional genetic engineering, which typically
focuses on making incremental modifications to existing
organisms. The fundamental philosophy of synthetic
biology is rooted in the belief that biological systems can
be understood, standardised, and engineered, much like
mechanical or electronic systems. Scientists approach
living organisms as intricate machines composed of
interchangeable parts or biobricks that can be
disassembled, analysed, and reconstructed to achieve
specific, predictable functionalities. This perspective
allows researchers to design biological systems with
unprecedented levels of control and intentionality. Its
holistic and interdisciplinary approach distinguishes
synthetic biology from previous biological research.
Researchers can develop innovative solutions that
address complex challenges across multiple domains
by integrating knowledge from biology, engineering,
computer science, and computational modelling. The
field draws inspiration from engineering principles
such as modularity, standardisation, and abstraction,
applying these concepts to biological systems in ways
that were previously unimaginable. The technological
foundations of synthetic biology are built upon advanced
tools and methodologies. Cutting-edge technologies
like CRISPR gene editing, high-throughput DNA
sequencing, and sophisticated computational modelling
enable scientists to manipulate genetic material with
unprecedented precision. Machine learning algorithms
and artificial intelligence further enhance researchers’
ability to predict and design complex biological
interactions. Unlike traditional genetic modification,
which typically involves minor changes to existing
organisms, synthetic biology aims to create entirely
new biological functions and organisms. This approach
opens up remarkable possibilities for addressing global
challenges in healthcare, environmental sustainability,
industrial manufacturing, and beyond.
Researchers can now design microorganisms
capable of producing complex chemicals, detecting
and treating diseases, cleaning ecological pollutants,
and developing sustainable technologies. The
potential applications of synthetic biology are vast
and transformative. In medicine, scientists develop
personalised cellular therapies, create intelligent
diagnostic tools, and design targeted treatments that can
adapt to individual genetic profiles. Environmental
researchers are engineering organisms that can
consume plastic waste, capture carbon dioxide,
and produce renewable energy sources. Industrial
biotechnologists are developing more efficient and
sustainable manufacturing processes that reduce
environmental impact. However, this powerful field
also raises important ethical considerations. The ability
to create and modify life forms necessitates careful
reflection on potential risks, ecological implications,
and societal consequences.
Robust regulatory frameworks and ongoing
ethical discussions are crucial to ensuring responsible
development and application of these groundbreaking
technologies. As our understanding of genetic systems
continues to deepen and technological capabilities
expand, synthetic biology promises to revolutionise
our approach to solving complex global challenges.
By treating biology as an engineering discipline,
researchers are not just modifying life but fundamentally
reimagining its potential. This field represents
a remarkable convergence of scientific disciplines,
offering unprecedented opportunities to address some
of humanity’s most pressing environmental, medical,
and technological challenges.
1.2 Embedded Hybridisation
The embedded hybridisation mode entails
the formation of hybrid cells through physical
encapsulation, which can involve either the
encapsulation of living cells within synthetic ones
or the reverse. This process is analogous to the
evolution of eukaryotic organelles, where independent
organisms were incorporated into a host, resulting
in eukaryotic cells through a symbiotic relationship.
The emerging organelles operate in a unique
physicochemical environment, enabling them to
specialise in specific functions. In artificial cell
contexts, embedded organelle-like compartments have
been utilised for various applications, such as spatially
segregated transcription and translation, stimuli-
responsive enzymatic reactions in vesicle-based cells,
and the engineering of synthetic signalling cascades
Image Courtesy: https://onlinelibrary.wiley.com/doi/full/10.1002/anie.202006941
Figure 1: Cellular bionics. Hybrid cellular bionic
systems can be constructed by fusing living and
non-living modules together. Living modules can
be cells (prokaryotic or eukaryotic) or organelles
(e.g. mitochondria and chloroplasts). Non-living
modules can consist of artificial cell-like compartments
composed of biological and synthetic molecular
components (e.g. enzymes, membrane proteins, DNA,
and nanoparticles). Artificial cells are constructed
from inanimate molecular building blocks and are
being used as simplified cell models to decipher the
rules of life. Artificial cells have the potential to be
designed as micromachines deployed in a host of clinical
and industrial applications.
Image Courtesy: https://onlinelibrary.wiley.com/doi/full/10.1002/anie.202006941
Figure 2: Embedded hybridisation. A–C) Examples
of living cells encapsulated in synthetic ones. A)
Engineered eukaryotic cells encapsulated in an artificial
cell containing a synthetic metabolism. Coupling
the living and synthetic cells in this way resulted
in an enzymatic cascade leading to the production
of a fluorescent molecule (resorufin) within the
hybrid bioreactor. B) Microscopy images showing
chloroplasts (red) encapsulated in coacervates (green).
Chloroplasts retained their light-induced electron
transport capabilities, as demonstrated by the Hill
reagent (DPIP) reduction depicted in the schematic. C)
Schematic of a chromatophore organelle extracted from
a photosynthesising organism and inserted in a synthetic
cell. Light irradiation led to the production of ATP,
which powered the translation apparatus to produce
mRNA. An example of a synthetic cell encapsulated in a
living one. The synthetic cells performed an organelle-
like function by degrading H2O2, thus shielding the
cell from the detrimental effects of this molecule.
between compartments. Recently, a growing trend has
been to broaden this concept towards the development
of hybrid living/synthetic systems.
1.3 The Fundamental Approach
The core philosophy of synthetic biology is to treat
biological systems like complex machines that can be
dismantled, understood, and reconstructed. Scientists in
this field view genetic code as a programmable language,
where DNA sequences are analogous to computer code.
By breaking down biological systems into standardised,
interchangeable parts called biobricks, researchers can
design and assemble novel genetic circuits with specific
functions.
Synthetic biology seeks to design and build new
biology that does useful things.
The field applies engineering principles of
modularity, standardisation, and abstraction
to promote rational design for industrial
applications.
DNA encodes bioparts, bioparts are combined
to make biological devices, and devices are built
into biological systems.
Synthetic biology is working towards a ‘plug-and-
play’ concept, in which off-the-shelf bioparts can
be loaded and run in finely characterised chasses.
Synthetic biology has applications in healthcare,
industrial processes, and for improving the
environment.
Creating a new life comes with a huge
responsibility. Ethical, safety, and security
implications must be considered and continually
re-assessed.
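The bioparts-to-devices-to-systems hierarchy described above can be illustrated with a toy data model; every name and DNA sequence below is an illustrative placeholder, not a real registry part:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class BioPart:
    """A standardised biopart: a named DNA fragment with a declared role."""
    name: str
    role: str      # e.g. "promoter", "rbs", "cds", "terminator"
    sequence: str  # toy placeholder sequence, not real biology

@dataclass
class Device:
    """A biological device: an ordered assembly of bioparts."""
    name: str
    parts: List[BioPart] = field(default_factory=list)

    def assemble(self) -> str:
        # Concatenating sequences in order mimics physical DNA assembly.
        return "".join(p.sequence for p in self.parts)

# A toy reporter device: promoter -> ribosome binding site -> gene -> terminator.
reporter = Device("reporter", [
    BioPart("promoter1", "promoter", "TTGACA"),
    BioPart("rbs1", "rbs", "AGGAGG"),
    BioPart("gfp", "cds", "ATGGTG"),
    BioPart("term1", "terminator", "TTATTA"),
])
print(reporter.assemble())  # TTGACAAGGAGGATGGTGTTATTA
```

The point of the sketch is the abstraction, not the biology: devices are composed from interchangeable parts, and systems would in turn be composed from devices, mirroring the ‘plug-and-play’ concept.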
1.4 Technological and Healthcare
Innovations
Advanced technologies form the backbone of
synthetic biology. Cutting-edge tools like CRISPR
gene editing, high-throughput DNA sequencing,
and sophisticated computational modelling enable
researchers to manipulate genetic material with
unprecedented precision. Machine learning algorithms
help predict how genetic modifications might behave,
allowing scientists to design more complex and reliable
biological systems.
The potential applications of synthetic biology
are remarkably diverse and transformative. In
healthcare, researchers are developing personalised
medical treatments, creating more effective drug
delivery systems, and designing diagnostic tools that can
detect diseases at extremely early stages. Environmental
scientists are engineering microorganisms capable of
consuming plastic waste, capturing carbon dioxide, or
producing sustainable biofuels.
In the medical field, synthetic biology offers
groundbreaking possibilities. Scientists are developing
smart cellular therapies to identify and destroy cancer
cells, create personalised medications tailored to
individual genetic profiles, and design biological
sensors to detect and respond to specific medical
conditions. These innovations could revolutionise
disease treatment and prevention.
Beyond healthcare, synthetic biology is
transforming industrial processes. Researchers
are creating microorganisms that can produce
complex chemicals, pharmaceuticals, and materials
more efficiently and sustainably than traditional
manufacturing methods. Environmental applications
include developing organisms to clean up oil spills,
convert waste into valuable resources, and create more
sustainable agricultural practices. Integrating industrial
and environmental solutions is vital for businesses
aiming to maintain competitiveness while adhering to
regulatory requirements. By focusing on sustainability
and efficient resource management, companies can
comply with laws and enhance their reputation and
operational effectiveness in a rapidly evolving market
landscape.
1.5 GES (General Environmental
Solutions)
GES provides environmental consulting,
engineering, compliance, and technical field services
tailored for manufacturing and legacy industrial
contamination. They collaborate with businesses
across sectors like aerospace, chemical, pharmaceutical,
and technology to develop customised environmental
permitting, compliance, risk management, and liability
management approaches. Their integrated teams
leverage deep technical expertise to implement practical
solutions that align with business objectives.
1.6 Environmental and Industrial
Solutions (EIS)
Established in 2011, EIS specialises in halon
recovery and reclamation throughout the Middle East.
They operate the only facility in Saudi Arabia capable
of processing halons to military specifications. EIS
focuses on recycling halons from firefighting systems to
meet current standards while providing comprehensive
maintenance services for fire protection systems.
Certain companies offer a wide range of industrial
wastewater treatment equipment and chemicals. They
provide customised wastewater treatment systems,
retrofits, upgrades, and solidification equipment
designed to enhance production efficiency while
reducing costs.
Founded in 2017, Greenlab caters to testing and
certification needs within the construction industry.
They focus on providing superior laboratory equipment
and consultancy services while adhering to quality
standards and customer satisfaction.
Comtech offers extensive consulting services in
environmental compliance alongside operations and
process management support. Their expertise helps
businesses navigate complex regulatory landscapes
while optimising their operational processes.
1.7 Ethical Considerations
As with any powerful technology, synthetic
biology raises important ethical questions. The
ability to create and modify life forms necessitates
careful consideration of potential risks, environmental
impacts, and societal implications. Robust regulatory
frameworks and ongoing ethical discussions are crucial
to ensuring responsible development and application of
these technologies.
1.8 Future Potential
The future of synthetic biology is auspicious. As
our understanding of genetic systems deepens and
technological capabilities expand, we can anticipate
increasingly sophisticated biological solutions. From
creating more resilient crops to developing novel
medical treatments, synthetic biology has the potential
to address some of humanity’s most pressing challenges.
1.9 Conclusion
Synthetic biology represents a paradigm shift
in understanding and interacting with living systems.
By treating biology as an engineering discipline,
researchers are opening up unprecedented possibilities
for solving complex global challenges. As the field
continues to evolve, it promises to reshape our approach
to medicine, environmental conservation, industrial
production, and our fundamental understanding of life
itself.
NB: Part 2 in the next volume.
Part 2, Synthetic Biology: Recent Advancements,
will cover developments that are transforming the field
and paving the way for innovative applications across
various sectors, including: CRISPR-Cas9 enhancements,
DNA synthesis innovations, synthetic cells and organs,
synthetic microbes, machine learning and data science
integration, development of synthetic vaccines,
sustainable materials production, and bioremediation
techniques.
References
Benner, S., & Sismour, A. (2005). Synthetic
biology. Nature Reviews Genetics, 6(6), 533–543.
https://doi.org/10.1038/nrg1637
Garner, K. L. (2021). Principles of synthetic
biology. Essays in Biochemistry, 65(5), 791-811.
https://doi.org/10.1042/EBC20200059
Wisner, S. (2021). Synthetic biology investment
reached a new record of nearly $8 billion in 2020 -
what does this mean for 2021? SynBioBeta market report.
https://synbiobeta.com/synthetic-biology-investment-set-a-nearly-8-billion-record-in-2020-what-does-this-mean-for-2021/
(accessed 24/06/2021)
https://onlinelibrary.wiley.com/doi/full/10.1002/anie.202006941
About the Author
Geetha Paul is one of the directors of
airis4D. She leads the Biosciences Division. Her
research interests extend from Cell & Molecular
Biology to Environmental Sciences, Odonatology, and
Aquatic Biology.
Part IV
General
Domain Generation Algorithm (DGA)
Detection Techniques
by Jinsu Ann Mathew
airis4D, Vol.3, No.2, 2025
www.airis4d.com
Domain Generation Algorithms (DGAs) generate
domains with unique characteristics to evade traditional
detection mechanisms. In the previous article,
we explored the evolution of DGAs, tracing their
progression from simple, pattern-based generators to
more complex, AI-driven systems that make detection
even more challenging. As DGAs have become more
advanced, the techniques to detect them have also
evolved, adapting to the increasingly sophisticated
nature of these threats.
This article takes a deeper dive into the modern
techniques employed to identify and mitigate DGA-
related threats. We focus on advanced detection
methods that address the unique challenges posed by
AI-enhanced DGAs, which can generate domain names
with greater complexity and variability. Some of the
cutting-edge approaches discussed include statistical
analysis, machine learning, behavioral monitoring, and
n-gram analysis, all of which are used to detect the
telltale signs of DGA activity in real-time. These
techniques not only improve detection accuracy but
also enhance our ability to combat the ever-evolving
tactics used by cybercriminals.
1.1 Statistical Analysis
Statistical Analysis for DGA Detection focuses
on leveraging mathematical and statistical properties
of domain names to distinguish between benign and
DGA-generated domains. This approach is based on
the observation that DGA domains often have distinct
patterns that differ from the typical structure of regular
domain names. The goal of statistical analysis is to
identify these differences using quantitative measures
to flag potential threats.
Character Distribution
DGA domains often have a highly irregular
character distribution compared to regular domains. For
instance, in human-registered domains, the character
distribution tends to follow certain natural language
patterns, such as vowels and consonants occurring with
typical frequencies. DGA domains, on the other hand,
often feature highly random or uniform distributions
of characters, with a heavy reliance on specific letters
or symbols generated by the algorithm. Statistical
tools can analyze the frequency of characters within a
domain name, comparing it to the expected distribution
in legitimate domains. A significant deviation can
indicate that the domain is likely generated by a DGA.
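A minimal sketch of this comparison, using a toy baseline built from a handful of legitimate names (a production system would derive the baseline from millions of domains in zone files or passive DNS) and an L1 distance between frequency tables:

```python
from collections import Counter

def letter_freqs(text: str) -> dict:
    """Relative frequency of each alphabetic character in the text."""
    letters = [c for c in text.lower() if c.isalpha()]
    total = len(letters) or 1
    return {c: n / total for c, n in Counter(letters).items()}

def distribution_distance(domain: str, baseline: dict) -> float:
    """L1 distance between a domain's letter frequencies and the
    baseline; larger values mean a more anomalous distribution."""
    freqs = letter_freqs(domain)
    return sum(abs(freqs.get(c, 0.0) - baseline.get(c, 0.0))
               for c in set(freqs) | set(baseline))

# Toy baseline from a handful of legitimate names.
legit = ["google", "amazon", "wikipedia", "github", "example"]
baseline = letter_freqs("".join(legit))

print(distribution_distance("facebook", baseline) <
      distribution_distance("xqzjwvkp", baseline))  # True
```

In practice the distance would be compared against a threshold calibrated on labeled data, or used as one feature among many.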
Domain Length Analysis
The length of domain names can also provide
insight into whether they are DGA-generated. Regular
domains tend to have varying lengths, but they often
fall within certain ranges (usually between 3 to 63
characters). In contrast, DGA domains may exhibit
certain length patterns—either shorter, more uniform,
or longer than typical domain names. By analyzing the
domain length distribution, statistical techniques can
flag domains that fall outside of expected ranges.
Entropy Measurement
Entropy is a measure of randomness or
unpredictability in a set of data. In the context of DGA
detection, entropy is used to assess the randomness
of a domain name’s character sequence. Human-
registered domain names often have recognizable
patterns or familiar words, which results in lower
entropy. DGA domains, generated randomly or using
specific algorithms, tend to have high entropy, meaning
they are more random and unpredictable. High entropy
values are often associated with DGA domains, making
this a useful statistical measure for detection.
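The entropy measure described above is straightforward to compute with the standard library alone; this sketch uses Shannon entropy in bits per character (the decision threshold a real detector would apply is tuned on labeled data):

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character: H = -sum(p * log2(p))."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A familiar name with repeated letters scores lower than a
# near-uniform string of the kind a DGA tends to emit.
print(shannon_entropy("google") < shannon_entropy("x3k9qzj7w1"))  # True
```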
1.2 Machine Learning (ML)
Approaches:
Machine Learning (ML) approaches for Domain
Generation Algorithm (DGA) detection have gained
prominence due to their ability to automatically learn
from data and adapt to evolving patterns, especially
when dealing with the more sophisticated AI-driven
DGAs. These techniques involve using labeled datasets
of both benign and DGA-generated domains to train
models that can then predict whether an unseen domain
is benign or malicious.
Supervised Learning:
Supervised learning relies on labeled datasets
containing both benign and DGA domains. Features
such as domain length, character frequency, entropy,
n-grams, and domain suffixes are extracted from domain
names to train models like decision trees, random
forests, support vector machines (SVM), or neural
networks. Once trained, these models classify new
domains with high accuracy, making them ideal for real-
time detection systems. However, their effectiveness
depends on having a diverse and comprehensive labeled
dataset.
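As a sketch of the feature-extraction step (the particular feature set and helper name here are illustrative choices, not a standard), each domain is mapped to a numeric vector that a decision tree, random forest, or SVM could then be trained on:

```python
import math
from collections import Counter

def extract_features(domain: str) -> list:
    """Map a domain to a numeric feature vector:
    [name length, character entropy, digit ratio, vowel ratio,
     distinct-character ratio]."""
    name = domain.split(".")[0].lower()   # drop the TLD for these features
    n = len(name) or 1
    counts = Counter(name)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    digit_ratio = sum(ch.isdigit() for ch in name) / n
    vowel_ratio = sum(ch in "aeiou" for ch in name) / n
    distinct_ratio = len(counts) / n
    return [len(name), entropy, digit_ratio, vowel_ratio, distinct_ratio]

# Legitimate-looking vs. DGA-looking input; each vector would become
# one training row, paired with a benign/malicious label.
print(extract_features("wikipedia.org"))
print(extract_features("xk2j9qzv1w.com"))
```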
Unsupervised Learning:
When labeled data is limited, unsupervised
learning becomes useful. Techniques like clustering
(e.g., K-means or DBSCAN) group domains with
similar characteristics, while anomaly detection
algorithms (e.g., Isolation Forest or One-Class SVM)
identify domains that deviate significantly from normal
patterns. These methods are particularly effective for
detecting previously unknown DGAs.
Deep Learning:
Deep learning models, such as Recurrent Neural
Networks (RNNs) and Long Short-Term Memory
(LSTM) networks, process domain names as sequences
of characters, capturing intricate patterns in their
structure. These models can automatically learn feature
representations, eliminating the need for manual feature
extraction. Convolutional Neural Networks (CNNs)
have also been applied, treating domains as sequences
where local patterns can be identified. Deep learning
excels in handling complex and dynamic DGAs but
requires large datasets and significant computational
resources.
Hybrid Approaches:
Hybrid approaches in DGA detection combine
the strengths of multiple machine learning techniques,
leveraging both supervised and unsupervised methods
to enhance accuracy, adaptability, and robustness. This
integration is particularly valuable in scenarios where
labeled data is incomplete or when new, unseen DGA
patterns emerge.
1.3 N-gram Analysis
N-gram Analysis is a powerful technique for
detecting domains generated by Domain Generation
Algorithms (DGAs). This method examines the
sequences of characters, or ”N-grams,” within
domain names to identify patterns that distinguish
human-generated domains from machine-generated
ones. Human-created domains, such as "google.com"
or "example.org", follow predictable linguistic and
semantic structures. In contrast, DGA-generated
domains like "xkzji.com" often appear random or
unnatural, making them detectable through their distinct
N-gram patterns.
The process begins by extracting N-grams from
domain names, where an N-gram represents a sequence
of n consecutive characters. For example, in the
domain "domain.com", the bigrams (2-grams) include
"do", "om", and "ai". The frequency of these N-grams
is then compared to a baseline constructed from
large datasets of legitimate domains. By identifying
unusual or statistically improbable N-gram distributions,
suspicious domains can be flagged as potentially
malicious.
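The extraction-and-comparison step can be sketched as follows; the baseline here is a toy table built from a few legitimate names, standing in for the large corpus a real system would tabulate:

```python
from collections import Counter

def bigrams(name: str) -> list:
    """All overlapping 2-character sequences in a name."""
    return [name[i:i + 2] for i in range(len(name) - 1)]

def ngram_score(domain: str, baseline: Counter, total: int) -> float:
    """Average baseline frequency of a domain's bigrams; values near
    zero mean its bigrams are rare among legitimate names."""
    grams = bigrams(domain.split(".")[0].lower())
    if not grams:
        return 0.0
    return sum(baseline[g] / total for g in grams) / len(grams)

# Toy baseline table of bigram counts over known-legitimate names.
legit = ["google", "facebook", "wikipedia", "amazon", "youtube", "example"]
baseline = Counter(g for name in legit for g in bigrams(name))
total = sum(baseline.values())

print(ngram_score("facebook.com", baseline, total) >
      ngram_score("xkzji.com", baseline, total))  # True
```

A domain whose score falls below a calibrated cutoff would be flagged for further inspection rather than blocked outright.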
To enhance accuracy, the extracted N-gram
features are often fed into machine learning models
trained to classify domains as benign or DGA-generated.
Advanced techniques, such as weighting rare N-
grams more heavily or using sliding windows to
capture overlapping patterns, further refine the analysis.
Additionally, language models like Markov Chains can
predict the likelihood of an N-gram sequence, with
lower probabilities signaling potential DGA activity.
N-gram analysis is scalable, efficient, and
adaptable to different languages and scripts, making
it an essential tool for real-time detection. However,
modern DGAs, especially those powered by AI, can
mimic natural language patterns, posing challenges to
detection. Despite these challenges, N-gram analysis
remains a cornerstone of DGA detection and is often
integrated with other methods, such as DNS traffic
monitoring, to enhance overall efficacy.
1.4 DNS Query Patterns
DNS query pattern analysis is a powerful method
used to detect domains generated by Domain Generation
Algorithms (DGAs). DGAs often rely on generating a
vast number of domains to communicate with Command
and Control (C&C) servers, creating distinct patterns in
DNS queries that can be analyzed to identify malicious
activity.
Identifying Anomalous Query Frequencies
One of the key characteristics of DGA activity is
the generation of a large number of domain names, only
a small subset of which are registered and actively
used by attackers. This behavior leads to a high
volume of DNS queries, most of which result in
failed resolutions, known as NXDOMAIN responses
(non-existent domain). By analyzing DNS query
logs, security systems can detect patterns of unusually
frequent NXDOMAIN responses, which often indicate
DGA activity. Additionally, DGAs may generate
domains using uncommon or rarely observed Top-Level
Domains (TLDs), further standing out from legitimate
domain traffic. Monitoring these high query rates and
unusual TLDs allows security systems to flag suspicious
behavior and isolate potential DGA-generated domains.
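A sketch of this idea over a hypothetical query log (the log format, threshold, and host addresses are illustrative assumptions):

```python
from collections import defaultdict

def nxdomain_ratios(dns_log, threshold=0.8, min_queries=5):
    """Flag source hosts whose share of NXDOMAIN responses exceeds a
    threshold. dns_log: iterable of (src_host, domain, response_code)."""
    stats = defaultdict(lambda: [0, 0])  # host -> [nxdomain count, total]
    for host, _domain, rcode in dns_log:
        stats[host][1] += 1
        if rcode == "NXDOMAIN":
            stats[host][0] += 1
    return [host for host, (nx, total) in stats.items()
            if total >= min_queries and nx / total >= threshold]

# Hypothetical log: 10.0.0.7 behaves like a DGA bot probing mostly
# unregistered domains; 10.0.0.2 resolves successfully every time.
log = ([("10.0.0.7", f"q{i}.com", "NXDOMAIN") for i in range(9)]
       + [("10.0.0.7", "c2.example.com", "NOERROR")]
       + [("10.0.0.2", "google.com", "NOERROR")] * 6)
print(nxdomain_ratios(log))  # ['10.0.0.7']
```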
Monitoring Temporal Patterns
Many DGAs operate on time-based generation
algorithms, creating new domains periodically based
on a predefined algorithm. This results in distinctive
temporal patterns in DNS queries. For instance, a bot
infected with a DGA-based malware may attempt to
resolve specific domains at consistent intervals, aligning
with the algorithm’s time-based logic. By analyzing
DNS query logs over time, it becomes possible to
identify these periodic behaviors. Domains queried in
tightly packed bursts or those that follow a predictable
daily or hourly pattern can be strong indicators of DGA
activity, as legitimate domain queries rarely exhibit
such consistent temporal regularity.
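One simple way to quantify such regularity is the coefficient of variation of the inter-query intervals, which approaches zero for clockwork-like beaconing; a sketch with illustrative timestamps (in seconds):

```python
import statistics

def interval_regularity(timestamps):
    """Coefficient of variation (stdev / mean) of inter-query gaps.
    Values near 0 suggest periodic, machine-driven queries; human
    browsing produces far burstier, irregular gaps."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2 or statistics.mean(gaps) == 0:
        return float("inf")
    return statistics.stdev(gaps) / statistics.mean(gaps)

periodic = [0, 600, 1200, 1800, 2400, 3000]  # one query every 10 minutes
human = [0, 45, 50, 900, 905, 4000]          # irregular browsing pattern
print(interval_regularity(periodic) < interval_regularity(human))  # True
```

In a deployed system this statistic would be computed per host over a sliding window and combined with the NXDOMAIN-rate signal above a tuned threshold.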
Identifying Newly Observed and Random
Domains
DGA-generated domains often differ significantly
from legitimate domains in their structure and usage
history. Most legitimate domains have a track record
of DNS queries and resolutions over time. In contrast,
DGA domains are newly generated and queried for
the first time, making them stand out in DNS traffic
logs. Additionally, DGA-generated domains often
appear random, with unusual character distributions
or nonsensical arrangements. By flagging domains
that are newly observed or lack meaningful resolution
history, security systems can isolate potential DGA
domains for further investigation.
1.5 Conclusion
In this article, we explored the diverse and
advanced techniques used to detect domains generated
by Domain Generation Algorithms (DGAs). As DGAs
have evolved from simple, rule-based approaches to
complex, AI-driven systems, detection strategies have
also advanced to counter these threats effectively.
Statistical analysis, machine learning, n-gram analysis,
and DNS query pattern analysis each play a critical role
in identifying and mitigating DGA-based activities.
Statistical methods help uncover anomalies in
domain characteristics and query frequencies, offering
a foundational layer of detection. Machine learning
techniques provide robust, adaptive solutions by training
models to identify intricate patterns indicative of
DGA behavior. N-gram analysis, focusing on the
structural patterns within domain names, excels at
identifying linguistic irregularities and randomness in
DGA-generated domains. DNS query pattern analysis,
with its focus on anomalous query rates, temporal
behaviors, and network-wide correlations, offers real-
time capabilities for detecting malicious activities at
scale.
Together, these approaches form a comprehensive
toolkit for combating the sophisticated and ever-
evolving strategies employed by cybercriminals. The
integration of multiple detection techniques ensures
a layered and adaptive defense system, capable of
addressing the unique challenges posed by AI-enhanced
DGAs. Moving forward, continued advancements
in detection strategies, coupled with interdisciplinary
research, will be essential in safeguarding networks and
systems against this persistent cybersecurity threat.
References
Explained: Domain Generating Algorithm
Domain generation algorithm
Detecting DGA Domains: Machine Learning
Approach
DGA Detection with data analytics
Detecting DGA domains with recurrent neural
networks and side information
Real-Time Detection of Dictionary DGA Network
Traffic Using Deep Learning
About the Author
Jinsu Ann Mathew is a research scholar
in Natural Language Processing and Chemical
Informatics. Her interests include applying basic
scientific research on computational linguistics,
practical applications of human language technology,
and interdisciplinary work in computational physics.
About airis4D
Artificial Intelligence Research and Intelligent Systems (airis4D) is an AI and Bio-sciences Research Centre.
The Centre aims to create new knowledge in the field of Space Science, Astronomy, Robotics, Agri Science,
Industry, and Biodiversity to bring Progress and Plenitude to the People and the Planet.
Vision
Humanity is in the 4th Industrial Revolution era, which operates on a cyber-physical production system. Cutting-edge
research and development in science and technology, creating new knowledge and skills, has become the key to
the new world economy. Most of the resources for this goal can be harnessed by integrating biological systems
with intelligent computing systems offered by AI. The future survival of humans, animals, and the ecosystem
depends on how efficiently the realities and resources are responsibly used for abundance and wellness. Artificial
intelligence Research and Intelligent Systems pursue this vision and look for the best actions that ensure an
abundant environment and ecosystem for the planet and the people.
Mission Statement
The 4D in airis4D represents the mission to Dream, Design, Develop, and Deploy Knowledge with the fire of
commitment and dedication towards humanity and the ecosystem.
Dream
To promote the unlimited human potential to dream the impossible.
Design
To nurture the human capacity to articulate a dream and logically realise it.
Develop
To assist talent in materialising a design into a product, a service, or knowledge that benefits the community and the planet.
Deploy
To deploy knowledge in practice, and to educate humanity that knowledge left undeployed makes no difference by its absence.
Campus
Situated on a lush green village campus in Thelliyoor, Kerala, India, airis4D was established under the auspices of the SEED Foundation (Susthiratha, Environment, Education Development Foundation), a not-for-profit company promoting Education, Research, Engineering, Biology, Development, and related fields.
The whole campus runs on solar power and has a rainwater-harvesting facility that can provide sufficient water for up to three months of drought. The computing facility on the campus is accessible from anywhere, 24×7, through dedicated optical-fibre internet connectivity.
There is a freshwater stream that originates from the nearby hills and flows through the middle of the campus.
The campus is a noted habitat for the biodiversity of tropical fauna and flora. airis4D carries out periodic and systematic surveys of water quality and species diversity in the region to ensure its richness. It is our pride that the site has consistently been environment-friendly and rich in biodiversity. airis4D also grows fruit plants that can feed birds, and maintains water bodies to help them survive drought.