Cover page
Image Name: Way of Science: Hubble Space Telescope observations of a pair of very distant exploding stars, called Type Ia supernovae (red spot at the centre, near the galaxy), provide new clues about the accelerating universe and its mysterious "dark energy." Astronomers used the telescope's Advanced Camera for Surveys to help pinpoint the supernovae, which are approximately 5 billion and 8 billion light-years from Earth. The farther one exploded so long ago that the universe may still have been decelerating under its own gravity. For full story:
https://hubblesite.org/contents/news-releases/2003/news-2003-12.html
Managing Editor: Ninan Sajeeth Philip
Chief Editor: Abraham Mulamoottil
Editorial Board: K Babu Joseph, Ajit K Kembhavi, Geetha Paul, Arun Kumar Aniyan, Sindhu G
Correspondence: The Chief Editor, airis4D, Thelliyoor - 689544, India
Journal Publisher Details
Publisher : airis4D, Thelliyoor 689544, India
Website : www.airis4d.com
Email : nsp@airis4d.com
Phone : +919497552476
Editorial
by Fr Dr Abraham Mulamoottil
airis4D, Vol.3, No.3, 2025
www.airis4d.com
The cover page of this edition is an image from the Hubble Space Telescope that captures observations of two distant Type Ia supernovae (red spot near the galaxy centre), located approximately 5 billion and 8 billion light-years from Earth. These exploding stars provide critical insights into the universe's accelerating expansion and the mysterious force known as "dark energy." The farther supernova's explosion occurred when the universe might still have been decelerating due to gravity.
In "Black Hole Stories-16," Ajit Kembhavi
explores the physics of gravitational wave generation
and their cosmic sources. Gravitational waves
are produced by accelerated or decelerated matter
and energy, described by the stress-energy tensor,
analogous to electromagnetic waves from accelerated
charges. However, due to gravity’s weak interaction,
only massive cosmic events can generate detectable
waves. Key sources include: (1) burst sources (e.g.,
supernovae), (2) continuous sources (e.g., asymmetric
spinning objects), (3) binary systems (e.g., neutron
stars, black holes), and (4) stochastic backgrounds.
Binary systems, particularly those with compact objects
like black holes or neutron stars, lose energy through
gravitational waves, causing orbital decay and eventual
merger. These mergers, detected by observatories like
LIGO, provide critical insights into gravitational wave
physics. The article sets the stage for future discussions
on binary pulsars and gravitational wave discoveries.
Aromal P's article "X-ray Astronomy: Through Missions" underscores the transformative influence of
X-ray astronomy missions in the 1990s. Significant
satellites such as ARGOS, the Chandra X-ray
Observatory, and XMM-Newton revolutionised our
understanding of high-energy phenomena. ARGOS (1999): Featured the Unconventional Stellar Aspect (USA) experiment, offering unprecedented X-ray timing capabilities for studying Low Mass X-ray Binaries. Chandra
(1999): Launched with advanced optics and instruments
like HETG, LETG, HRC, and ACIS, Chandra provided
high-resolution imaging and spectroscopy. It made
groundbreaking discoveries, including evidence for dark
matter, X-ray pulsations from Jupiter, and resolved the
cosmic X-ray background into discrete sources. XMM-
Newton (1999): Equipped with EPIC, RGS, and OM
instruments, it enabled broadband analysis from visible
to X-ray ranges. Key achievements include measuring
supermassive black hole spin rates, studying tidal
disruption events, and mapping dark matter distribution.
These missions laid the foundation for modern X-ray
astronomy, inspiring future explorations and expanding
our knowledge of the universe.
The article "Supernova: The Explosive Death of Stars" by Sindhu G explores the dramatic and
transformative role of supernovae in the universe.
Supernovae are powerful stellar explosions that mark
the end of a star’s life, releasing immense energy
and dispersing heavy elements essential for cosmic
evolution. They are classified into two main types:
Type I: Lacks hydrogen in its spectrum and includes Type Ia (white dwarf explosions in binary systems) and Types Ib and Ic (massive stars that shed their outer layers). Type II: Retains hydrogen and results from the core collapse of massive stars, with subcategories like II-P, II-L, and IIn. Supernovae are driven by either thermonuclear explosions (Type Ia) or core collapse (Type II, Ib, Ic). Their aftermath includes the formation
of neutron stars, pulsars, black holes, and supernova
remnants that enrich the interstellar medium. These
explosions are vital for: Element formation: Producing
heavy elements like iron and gold. Cosmic distance
measurement: Type Ia supernovae serve as "standard candles." Star formation: Shock waves trigger the
birth of new stars. Planetary systems: Influencing the
chemical composition of planets. Notable historical
supernovae, such as SN 1054 (Crab Nebula) and SN
1987A, have provided critical insights into stellar
evolution. Future advancements, like the Vera C.
Rubin Observatory, promise to detect thousands of
supernovae annually, furthering our understanding of
the universe. Supernovae are not just destructive events
but are fundamental to the cosmic life cycle, shaping
the universe and enabling the existence of planets and
life.
The article "Synthetic Biology: A Revolutionary Scientific Frontier - Part 2" by Geetha Paul
explores recent advancements in synthetic biology,
a field that combines biology, engineering, and
computational design to reprogram organisms for
innovative applications. Key breakthroughs include:
CRISPR-Cas9 Enhancements: Improved precision
in gene editing with tools like base editing, prime
editing, and Cas12/Cas13 variants for RNA editing and
diagnostics. DNA Synthesis Innovations: Advances
in enzymatic and chip-based DNA synthesis have
reduced costs and enabled the creation of longer,
more complex genetic sequences. Synthetic Cells
and Organs: Progress in creating artificial cells
and organoids for regenerative medicine and drug
testing. Synthetic Microbes: Engineered microbes for
biofuel production, pathogen detection, and therapeutic
applications. Machine Learning Integration: AI
tools like AlphaFold optimize protein design, genetic
circuits, and metabolic pathways. Synthetic Vaccines:
Development of mRNA vaccines (e.g., COVID-19) and
exploration of synthetic vaccines for cancer and HIV.
Sustainable Materials: Production of biodegradable
plastics, bio-based textiles, and lab-grown leather
using engineered microorganisms. Bioremediation:
Engineered microbes and plants for cleaning pollutants
like plastic waste and heavy metals. These
advancements highlight synthetic biology’s potential to
address global challenges in healthcare, sustainability,
and environmental remediation. However, the
field raises ethical, social, and regulatory concerns,
necessitating responsible use and equitable distribution
of benefits. Synthetic biology promises to transform
industries and improve lives, paving the way for a
healthier, more sustainable future.
The article "Principal Component Analysis" by Linn Abraham explains Principal Component Analysis
(PCA), a fundamental technique in machine learning
used for feature extraction, dimensionality reduction,
and data visualization. PCA identifies a smaller set
of features (principal components) that capture the
maximum variance in the data, reducing noise and
improving machine learning results. Key steps in
PCA include: 1. Covariance Matrix: Computes how
variables in the dataset vary together. 2. Eigenvectors
and Eigenvalues: Eigenvectors represent directions
of maximum variance, and eigenvalues indicate the
magnitude of variance along these directions. 3.
Principal Components: The eigenvectors corresponding
to the largest eigenvalues are the principal components,
which form a new basis for the data. 4. Dimensionality
Reduction: By selecting a subset of principal
components that explain most of the variance, the data
can be represented in a lower-dimensional space.
The PCA algorithm involves: Centering the data
by subtracting the mean. Computing the covariance
matrix. Finding eigenvectors and eigenvalues. Sorting
and selecting components based on eigenvalues. and
Projecting the data onto the selected components. PCA
is widely used in various fields, including astronomy,
for tasks like galaxy classification and solar flare
prediction. The article provides a mathematical
foundation and practical steps for implementing PCA,
making it a powerful tool for simplifying and analyzing
complex datasets. The article "An Introduction to Parallel Computing" by Ajay Vibhute explores the evolution of computing from its early beginnings with Charles Babbage's Analytical Engine to modern parallel computing systems. Parallel computing, which enables
multiple tasks to be executed simultaneously, has
become essential for achieving high performance and
efficiency in today’s systems.
The article focuses on two key memory
architectures used in parallel computing: 1. Shared
Memory Architecture: All processors access a single
memory pool, simplifying data exchange but requiring
synchronization to avoid conflicts. Divided into
Uniform Memory Access (UMA), where all processors
have equal access time, and Non-Uniform Memory
Access (NUMA), where processors have faster access
to local memory. Limited scalability due to memory
contention as the number of processors increases.
2. Distributed Memory Architecture: Each processor
has its own dedicated memory, improving scalability
and fault tolerance. Requires explicit communication
between processors, often using protocols like Message
Passing Interface (MPI). Includes compute clusters,
grid computing, and cloud computing, which allow
systems to scale by adding more processors and memory
units. The article concludes that the choice between
shared and distributed memory architectures depends
on the application’s needs, scalability requirements, and
available resources. A hybrid approach combining both
architectures is often used to optimize performance and
resource utilization in high-performance systems.
The article "Understanding Cosine Similarity: A Mathematical Perspective" by Jinsu Ann Mathew
explores the mathematical foundations and practical
applications of cosine similarity, a key metric in data
science and machine learning. Cosine similarity
measures the alignment between two vectors by
calculating the cosine of the angle between them,
focusing on direction rather than magnitude. This
makes it particularly useful in high-dimensional
spaces, such as natural language processing (NLP),
recommendation systems, and clustering algorithms.
Mathematically, cosine similarity is derived from the
dot product of vectors normalized by their magnitudes,
ensuring scale invariance. It ranges from -1 (opposite
directions) to 1 (identical directions), with 0 indicating
orthogonality. The article highlights its geometric
interpretation, comparing it to vector interactions in
physics, and demonstrates its utility in text analysis,
where documents are represented as vectors. Practical
examples illustrate how cosine similarity can identify
similar or unrelated documents based on shared terms.
The author concludes that cosine similarity is a versatile
and powerful tool for quantifying relationships in
data, making it indispensable in fields like NLP,
recommendation systems, and anomaly detection.
Its ability to focus on directional alignment rather
than absolute values ensures its relevance in both
computational and real-world applications.
News Desk
Humanity at Crossroads?
Students from Christian College, Chengannur doing a project in the biosciences lab.
Dr Biju K.G from WMO College, Wayanad and Dr Padmakumar from MG College, Trivandrum on their visit to
airis4D.
Contents
Editorial ii
I Astronomy and Astrophysics 1
1 Black Hole Stories-16
The Generation of Gravitational Waves 2
1.1 Generation of Gravitational Radiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 X-ray Astronomy: Through Missions 5
2.1 Satellites in 1990s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 Supernova: The Explosive Death of Stars 9
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 What is a Supernova? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 Types of Supernovae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.4 Causes of Supernovae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.5 The Aftermath of a Supernova . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.6 The Role of Supernovae in Cosmic Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.7 Importance of Supernovae in Astronomy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.8 Observing Supernovae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.9 Notable Supernovae in History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.10 Future Research and Supernova Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.11 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
II Biosciences 12
1 Synthetic Biology: A Revolutionary Scientific Frontier - Part 2
Recent Advancements in Synthetic Biology 13
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 A Brief Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
III Computer Programming 19
1 Principal Component Analysis 20
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.2 Covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.3 Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.4 Principal Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.5 The PCA algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2 An Introduction to Parallel Computing 23
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 Memory Architectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3 Understanding Cosine Similarity: A Mathematical Perspective 26
3.1 Mathematical Formula for Cosine Similarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Geometric Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Cosine Similarity in Data Science . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.4 Understanding Cosine Similarity Through a Practical Example . . . . . . . . . . . . . . . . . . . 28
3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Part I
Astronomy and Astrophysics
Black Hole Stories-16
The Generation of Gravitational Waves
by Ajit Kembhavi
airis4D, Vol.3, No.3, 2025
www.airis4d.com
In this story we will consider the physics of the
generation of gravitational waves, and the nature of the
cosmic sources which generate detectable gravitational
waves.
1.1 Generation of Gravitational
Radiation
In electromagnetism, electric charges and currents
are the source of electric and magnetic fields. A
distribution of static electric charges produces a static
electric field, while a steady electric current produces
a steady magnetic field. The situation becomes much
more interesting when electric charges are accelerated,
or there are changing electric currents, since these
produce electromagnetic waves which travel away from
the sources, carrying energy and information to distant
parts.
We have found in BHS-15 that for weak gravitational fields, Einstein's field equations of gravitation take the form of a wave equation, as in the case of electromagnetism. The sources of the gravitational waves here are matter and energy, which are described by a quantity known as the stress-energy tensor. Matter at rest or in uniform motion in a given
tensor. Matter at rest or in uniform motion in a given
frame of reference does not produce gravitational
waves, just as electric charges at rest or moving
uniformly do not produce electromagnetic waves. It
is matter in accelerated (or decelerated) motion and
energy which act as sources of gravitational waves.
While every object, however small its mass, would radiate gravitational waves when accelerated, the very weak nature of the gravitational interaction means that such waves produced by terrestrial objects would be
impossible to detect. This is very different from the case
with electromagnetism, where the interaction between
electromagnetic waves and charged particles is so strong
that these waves can be easily produced and detected.
So a simple cell phone can communicate with an Earth
orbiting satellite, and we can detect signals sent out by
satellites at the far reaches of the Solar system.
The nature of the sources which can emit
gravitational radiation follows from a detailed analysis
of the gravitational wave equation. It can be shown,
for example, that an expanding or contracting uniform
sphere of matter, accelerating during the expansion and
contraction, will not emit gravitational waves, because
of the spherical symmetry. Similarly, a cylinder or a
disc of matter or a dumbbell which is rotating around
its axis of symmetry cannot emit gravitational radiation
because of the cylindrical symmetry. But the same
dumbbell rotating around an axis perpendicular to
its length can emit gravitational radiation as shown
in Figure 1. The emitted gravitational radiation has
different components, the most important of which is
called quadrupole radiation. The other components
present are much weaker than the quadrupole part.
Figure 1: A dumbbell which is symmetric around the
axis A and is asymmetric around axis B. It will not emit
gravitational waves if it rotates around axis A but will
emit these waves while rotating around axis B.
Image Credit: Kaushal Sharma.
We have to depend on cosmic sources for
potentially detectable gravitational wave emission.
These sources are mainly of four types: (1) burst sources, like supernova explosions and gamma-ray bursts, which are short-duration, one-off events; (2) continuous sources, like spinning compact objects with some asymmetry, which can emit gravitational waves at the same frequency for a long duration; (3) binary sources, with each of the two components being a compact object such as a white dwarf, a neutron star or a black hole; (4) stochastic background sources, which are collections of a large number of weak sources that cannot be detected individually but can be observed collectively as a background of gravitational waves.
Here we will consider only stellar binary sources, which
are bound systems of two stellar mass objects which are
in orbit around each other. The objects can be normal
or evolved stars or compact remnants of stars which
have finished their evolution, which are white dwarfs,
neutron stars or black holes. The binary can be made
up of any combination of these objects.
It can be shown from a detailed analysis of
the gravitational wave equation that a binary system
consisting of two objects with the same mass M going
round each other in a circular orbit with period P would
emit gravitational radiation with luminosity (energy
emitted per unit time)
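(a minimal sketch, assuming the standard quadrupole result for two equal point masses M in a circular orbit of period P)

L_{GW} = \frac{2^{13/3}}{5}\,\frac{G^{7/3}}{c^{5}}\,M^{10/3}\left(\frac{2\pi}{P}\right)^{10/3}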
Inserting numerical constants, and expressing the
mass M in units of Solar mass and the orbital period
in hours, the gravitational wave luminosity can be
expressed as
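(again a sketch based on the quadrupole result above, with the numerical coefficient rounded)

L_{GW} \approx 2\times 10^{26}\left(\frac{M}{M_\odot}\right)^{10/3}\left(\frac{P}{1\,\mathrm{h}}\right)^{-10/3}\ \mathrm{W}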
A compact binary with Solar mass stars and a
period of 1h would therefore have a gravitational wave
luminosity comparable to the energy emitted by the Sun
over infrared to ultraviolet wavelengths. The luminosity
would be higher for more massive objects and smaller
separations, but would be lower for larger separations.
The emission of gravitational waves causes the
binary to lose energy, so that the total energy becomes
more negative, and the separation of the components
decreases, as discussed in BHS-7. This leads to a decrease
in the orbital period at a rate given by
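(a sketch obtained by combining the luminosity above with the Newtonian orbital energy, again for equal masses in a circular orbit)

\frac{dP}{dt} = -\,\frac{192\,\pi}{5\cdot 2^{1/3}}\,\frac{1}{c^{5}}\left(\frac{2\pi G M}{P}\right)^{5/3}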
Introducing constants, the expression reduces to
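(a sketch under the same equal-mass, circular-orbit assumptions, with M in solar masses and P in hours)

\frac{dP}{dt} \approx -\,3.5\times 10^{-12}\left(\frac{M}{M_\odot}\right)^{5/3}\left(\frac{P}{1\,\mathrm{h}}\right)^{-5/3}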
Since the period P has the units of time, dP/dt is a dimensionless quantity. (The above expressions are all taken from the book Gravity: An Introduction to Einstein's General Relativity by James B. Hartle.)
The expressions can be generalised to unequal
mass components and orbits which have eccentricity,
i.e. they have an elliptical shape. The radiation is
emitted at a fundamental frequency which is twice
the orbital frequency, and at higher harmonics of the
fundamental, with the number of harmonics depending
on the eccentricity.
The energy loss due to the emission of gravitational
waves causes the system to shrink in size, which is
indicated by a decrease in its orbital period P, which
leads to an increase in the frequency with which the two
bodies go round each other. The period changes more
rapidly as P decreases, and eventually the two objects
should collide and merge together. That is possible
when the bodies are compact objects like neutron stars or
black holes. In fact, all the gravitational wave emitting
binary systems discovered by the LIGO gravitational
wave detectors are black hole binaries, neutron star binaries, or binary systems with one component a
neutron star and the other a black hole. All these
binaries were detected close to the merger, when the
gravitational wave luminosity is large enough to be
detected and the change in frequency and the merger
occur over just a fraction of a second. We will describe
these detections in greater detail in future stories.
The situation is more complicated for binary
systems which consist of normal or evolved stars. As
such a binary continues to shrink in size, each star exerts a greater gravitational force on the other, which changes their shapes away from a nearly spherical form. If the binary is very compact, the star with the
greater mass could fill an equipotential surface known
as a Roche lobe, which leads to mass transfer from the
Roche filling star to the other star, which changes the
nature of both stars. Matter can also leave the system.
For these effects to occur, the binary will need to be very
compact, so that there is copious gravitational wave
emission leading to contraction. But most stellar binary
systems have a separation of 1 AU or more, where AU
is the Astronomical Unit, which is equal to the Sun-
Earth distance of about 150 million km. Such a binary
for Solar mass stars would have a period of a fraction
of a year, i.e. several thousand hours. Such binaries
would have very feeble gravitational wave emission,
leading to very slow contraction. For rapid contraction
eventually leading to coalescence, we need binaries of
short periods with the binary components made up of
compact objects. We will come across such binaries in
our future stories.
Next Story: In the next story we will consider the discovery and formation of the binary radio pulsar PSR 1913+16, whose discovery in 1974 led to the first evidence that gravitational waves exist.
About the Author
Professor Ajit Kembhavi is an emeritus
Professor at Inter University Centre for Astronomy
and Astrophysics and is also the Principal Investigator
of the Pune Knowledge Cluster. He was the former
director of Inter University Centre for Astronomy and
Astrophysics (IUCAA), Pune, and a former vice president of the International Astronomical Union. In collaboration with IUCAA, he pioneered astronomy outreach activities from the late 1980s to promote astronomy
research in Indian universities.
X-ray Astronomy: Through Missions
by Aromal P
airis4D, Vol.3, No.3, 2025
www.airis4d.com
2.1 Satellites in 1990s
The huge success of RXTE and BeppoSAX gave a supersonic boost to X-ray astronomy. Both satellites influenced the upcoming missions and proved that many things lay hidden in these energy ranges. Even though RXTE was a huge success and had great timing capabilities, it did not have any optics, which made localization of sources difficult, and its spectra were not detailed enough to reveal many small-scale spectral variations. In the latter half of the 1990s, we saw many exciting missions that filled the gaps left by RXTE, and today, we are going to discuss those game-changers.
2.1.1 ARGOS (USA)
The Advanced Research and Global Observations
Satellite (ARGOS), also known as the NRL-801
Experiment, was funded by the U.S. Department of
Defense Space Test Program and operated by the U.S.
Air Force. This satellite was planned as a technology
demonstrator for new detectors. It was launched in February 1999 into an 800 km circular sun-synchronous orbit with a 98.7° inclination and was decommissioned in
2003. ARGOS had nine experiments among which
one was for high-energy astrophysical studies. We will
focus solely on that experiment here.
Unconventional Stellar Aspect (USA) was a low-cost X-ray timing experiment with an unprecedented timing capability of 1 µs. USA was operated for nearly 16 months starting from April 1999. It consisted of large-area gas scintillation proportional counters that worked in a 1-15 keV energy range and had an effective area of 2000 cm².
USA tracked bright X-ray sources without commands from the ground stations. It provided timing and spectral information about different Low Mass X-ray Binaries.
2.1.2 CHANDRA
Chandra was another flagship project that
revolutionized our ideas about high-energy phenomena.
The satellite was launched in July 1999 aboard the space shuttle Columbia into a highly elliptical orbit with a perigee of 10,000 km and an apogee of 140,000 km and an inclination angle of 28.5°. It revolved around the Earth in 64 hours,
which gave longer observation time to a source without
interruption. Chandra X-ray Observatory was named
after Indian-American scientist Dr. Subrahmanyan
Chandrasekhar. Although initially planned for 5
years, the satellite is still in operation, giving valuable
data and redefining our understanding of the X-ray
sky. Chandra surpassed all its predecessors in angular resolution by orders of magnitude, which allowed the satellite to detect and study nearby X-ray emitting sources and also helped with the precise positioning of X-ray sources. Chandra had a High-Resolution
Mirror Assembly (HRMA), a multi-mirror array for
focusing highly energetic X-rays through the grazing
angle principle. HRMA had a diameter of 1.2 m, a length of 80 cm, and a focal length of 10 m, with two scientific instruments located at its focal plane. Chandra also carried high-resolution spectroscopic instruments, which could reveal small deviations in the spectrum.
Chandra had 4 scientific instruments:
High Energy Transmission Grating (HETG)
Figure 1: Chandra X-ray Observatory.
Credits: NASA.
consisted of 336 grating facets for high-resolution
spectroscopy, with a spectral resolving power of
up to 1000. It worked with the ACIS detectors in an energy range of 0.4-10 keV.
Low Energy Transmission Grating (LETG)
consisted of 540 grating elements, giving it the
highest spectral resolving power among all the
instruments on Chandra. It had a resolving power
of more than 1000 in the lower energy ranges of
0.07-0.2 keV. The instrument covered an energy range of 0.07-7.29 keV.
High Resolution Camera (HRC) consisted of two microchannel plate (MCP) detectors, one for imaging and one for spectroscopy. Each MCP was made of a 10-cm square cluster of 69 million tiny lead-oxide glass tubes about 10 micrometers in diameter, which is responsible for the high spatial resolving power of the instrument. HRC was located in the focal plane of the HRMA. It worked in the energy range 0.08-10 keV, with a maximum effective area of 133 cm² (at 0.277 keV) for the imaging detector. The imaging detector had a Field of View (FOV) of 16.9 × 16.9 arcmin.
Advanced CCD Imaging Spectrometer (ACIS) consisted of 10 charge-coupled devices arranged in two arrays, one optimized for imaging wide fields (FOV: 30 × 30 arcmin) and the other optimized for readout of the HETG (FOV: 6 × 90 arcmin). This was the second instrument located at the focal plane, and it worked in the 0.2-10 keV energy range with an effective area of 227 cm² at 1 keV for the imaging part.
Chandra revealed many secrets of the universe,
including strong evidence for dark matter by studying
colliding galaxy clusters, such as the Bullet Cluster.
It made the first X-ray detection of Sagittarius A*,
the supermassive black hole at the center of our
galaxy. Chandra also contributed to the discovery
of intermediate-mass black hole candidates and studied
possible supermassive black hole binaries. Additionally,
it detected X-ray pulsations from Jupiter's poles. Chandra's Deep Field observations resolved 95% of the cosmic X-ray background into discrete sources, primarily supermassive black holes in galactic centers, shedding light on their rapid growth in the early universe. Chandra also provided evidence for a possible extragalactic exoplanet in the Whirlpool Galaxy (M51) using the transit method and studied space weather conditions relevant to planets in the Alpha Centauri system. After
25 years of service, Chandra continues to provide
valuable scientific data, expanding our knowledge of
the universe.
2.1.3 XMM-NEWTON
X-ray Multi-mirror Mission (XMM) - Newton
was launched by the European Space Agency (ESA) in
December 1999. It was launched into a highly elliptical orbit with a perigee of about 7,000 km and an apogee of about 114,000 km, with an inclination of 40°; over time its orbit has evolved. The highly elliptical orbit gives an orbital period of 48 hours, which provides unobscured observation of X-ray sources for a long time. Initially, XMM-
Newton was planned for 10 years, and currently, it is
entering its 25th year in observing outer space, with
all the instruments still working properly. The satellite
functions in the 0.1-15 keV X-ray energy range and, with its optical monitor, covers wavelengths from visible light to medium-energy X-rays. The XMM-Newton
observatory features three advanced X-ray telescopes,
each consisting of a sophisticated mirror assembly. Each
mirror module comprises 58 nested Wolter I grazing-
incidence mirrors. These mirror modules focus the incoming X-rays, and the scientific instruments in XMM-Newton are located at the focal planes of the three mirror modules.
XMM-Newton has three scientific instruments on
board:
Figure 2: XMM-Newton Observatory Credits: ESA
The European Photon Imaging Cameras (EPIC) - the primary instrument of XMM-Newton - consist of two Metal Oxide Semiconductor (MOS) CCD cameras and one pn-CCD camera, each placed at the focus of one of the mirror modules. EPIC provides imaging and spectroscopic data in an energy range of 0.2-12 keV. EPIC-MOS is designed for studies in the low-energy X-ray range, providing better imaging and spectroscopic capabilities than the pn-CCD, but the latter has higher timing capabilities than EPIC-MOS. The FOV of both the MOS and pn cameras in the EPIC system is 30 arcminutes. The combined effective area of the EPIC detectors is nearly 1,500 cm² at 1 keV.
The Reflection Grating Spectrometers (RGS) consist of a combination of reflection gratings and CCD detectors to achieve high-resolution soft X-ray spectroscopy in the energy range of 0.3-2.1 keV. RGS provides a spectral resolving power of 100-500 across this energy range, which helps to detect the different line transitions that dominate the soft X-ray band. RGS has two working modes: Spectral mode and Timing mode. Two identical RGS units are located behind the second and third mirror assemblies.
The Optical/UV Monitor (OM) is a 30-cm Ritchey-Chrétien telescope that provides simultaneous UV and optical data. It is co-aligned with the X-ray telescopes and works alongside them. It has three optical and three UV filters and two grisms (a grating combined with a prism) for low-energy spectroscopic studies. It works in the 180-600 nm wavelength range with a square FOV of 17 arcminutes.
With the three scientific instruments, XMM-Newton
provided a broadband analysis ranging from the visible
range to the X-rays. XMM-Newton achieved the first
measurement of a supermassive black hole’s spin rate
in galaxy NGC 1365, revealing rapid rotation that
informs galaxy evolution models. It also detected quasi-
periodic oscillations (QPOs) from the supermassive
black hole 1ES 1927+654. XMM-Newton studied tidal disruption events, in which a star is torn apart and its matter accreted by a black hole. Its deep surveys
aided in mapping dark matter distribution through
galaxy cluster studies. It also discovered unusually cold
neutron stars, which challenged the existing models of
stellar evolution. Over its 25-year-long and still continuing programme of studies and surveys, XMM-Newton has provided information that is a cornerstone of high-energy astrophysics.
The satellites of the 1990s inspired the coming generation to aim for bigger satellites and more ambitious science goals. X-ray astronomy got a huge boost, and in the years that followed, the quest for the unknown accelerated.
We will discuss the satellites that came after this in the
upcoming articles.
References
Santangelo, Andrea; Madonia, Rosalia; and Piraino, Santina. A Chronological History of X-Ray Astronomy Missions. Handbook of X-ray and Gamma-ray Astrophysics. ISBN 9789811645440
K. S. Wood, G. Fritz, P. Hertz, W. N. Johnson, M. P. Kowalski, M. N. Lovellette, M. T. Wolff, D. J. Yentis, E. Bloom, L. Cominsky, K. Fairfield, G. Godfrey, J. Hanson, A. Lee, P. Michelson, R. Taylor and H. Wen. The USA Experiment on the ARGOS Satellite: A Low Cost Instrument for Timing X-ray Binaries. SPIE Vol. 2280
X-ray and Gamma-ray Missions
Chandra Science Instruments
Chandra Instruments and Calibration
Chandra Spacecraft and Instruments
Chandra X-ray Observatory
XMM-Newton Instruments
Megan Masterson, Erin Kara, Christos Panagiotou, William N. Alston, Joheen Chakraborty, Kevin Burdge, Claudio Ricci, Sibasish Laha, Iair Arcavi, Riccardo Arcodia, S. Bradley Cenko, Andrew C. Fabian, Javier A. García, Margherita Giustini, Adam Ingram, Peter Kosec, Michael Loewenstein, Eileen T. Meyer, Giovanni Miniutti, Ciro Pinto, Ronald A. Remillard, Dev R. Sadaula, Onic I. Shuvo, Benny Trakhtenbrot, Jingyi Wang. Millihertz Oscillations Near the Innermost Orbit of a Supermassive Black Hole. arXiv:2501.01581
X-ray Satellite XMM-Newton Celebrates 20
Years in Space
XMM-Newton factsheet
M. Güdel. A decade of X-ray astronomy with XMM-Newton. A&A 500, 595-596 (2009). DOI: 10.1051/0004-6361/200912208
About the Author
Aromal P is a research scholar in the Department of Astronomy, Astrophysics and Space Engineering (DAASE) at the Indian Institute of Technology Indore. His research mainly focuses on studies of thermonuclear X-ray bursts on neutron star surfaces and their interaction with the accretion disk and corona.
Supernova: The Explosive Death of Stars
by Sindhu G
airis4D, Vol.3, No.3, 2025
www.airis4d.com
Figure 1: An artist's interpretation of a generic supernova. (Image Credit: Soubrette/iStock/Getty Images Plus)
3.1 Introduction
Supernovae are among the most powerful and
fascinating events in the universe. These stellar
explosions release immense amounts of energy,
outshining entire galaxies for a short period. They play
a crucial role in the cosmic cycle of matter, dispersing
heavy elements into space, influencing star formation,
and even impacting planetary systems. This article
explores the different types of supernovae, their causes,
their role in cosmic evolution, and their significance in
astrophysics.
3.2 What is a Supernova?
A supernova occurs when a star undergoes a
catastrophic explosion, resulting in an intense burst
of energy. This explosion marks the end of a star's life
cycle and can be observed across vast cosmic distances.
Supernovae are classified into two main types: Type I
and Type II, based on their underlying mechanisms and
the presence or absence of hydrogen in their spectra.
3.3 Types of Supernovae
3.3.1 Type I Supernovae
Type I supernovae lack hydrogen lines in their
spectra and are further divided into subcategories:
Type Ia: These occur in binary star systems
where a white dwarf accretes matter from its
companion. When the white dwarf reaches the Chandrasekhar limit (about 1.4 solar masses), it
undergoes runaway nuclear fusion, leading to a
thermonuclear explosion.
Type Ib and Ic: These result from massive stars
that have shed their outer hydrogen layers before
collapsing. Type Ib still retains helium, whereas
Type Ic lacks both hydrogen and helium.
3.3.2 Type II Supernovae
Type II supernovae retain hydrogen in their spectra
and originate from massive stars undergoing core
collapse. These stars exhaust their nuclear fuel, leading
to gravitational collapse and a subsequent explosion.
They are further classified into:
Type II-P: Characterized by a plateau in their
light curves, indicating a prolonged phase of
hydrogen recombination.
Type II-L: Display a linear decline in brightness
over time.
Type IIn: Show narrow spectral lines due to
strong interactions with surrounding material.
3.4 Causes of Supernovae
The mechanisms driving supernovae vary based
on the type:
1. Thermonuclear Explosions: In Type Ia supernovae, the explosion results from the detonation of a white dwarf in a binary system after exceeding its mass limit.
2. Core Collapse: In Type II, Ib, and Ic supernovae, a massive star undergoes gravitational collapse when its core can no longer support the weight of the outer layers. This collapse triggers a shock wave that expels the star's outer material into space.
3.5 The Aftermath of a Supernova
The remnants of a supernova include:
Neutron Stars and Pulsars: If the core of a
collapsed star remains between 1.4 and 3 solar
masses, it forms a neutron star, an incredibly
dense object composed primarily of neutrons.
Black Holes: If the core's mass exceeds the
Tolman–Oppenheimer–Volkoff limit, it collapses
into a black hole.
Supernova Remnants: Expelled material
enriches the interstellar medium with elements
like iron, oxygen, and silicon, which contribute
to the formation of new stars and planets.
3.6 The Role of Supernovae in Cosmic Evolution
Supernovae are not just destructive events; they are
crucial to the life cycle of the universe. These explosions
distribute heavy elements essential for the formation
of planets and biological life. Without supernovae,
elements like carbon, oxygen, and iron would not exist
in abundance, making planetary formation and life
as we know it impossible. Supernovae also regulate
star formation by injecting energy into the interstellar
medium, creating shock waves that compress gas clouds,
leading to the birth of new stars.
3.7 Importance of Supernovae in
Astronomy
Supernovae play a critical role in shaping the
cosmos:
Element Formation: Heavy elements such
as gold, silver, and uranium are produced
in supernova nucleosynthesis and distributed
throughout the universe.
Cosmic Distance Indicators: Type Ia
supernovae serve as standard candles for
measuring astronomical distances, aiding in our
understanding of the universe's expansion.
Triggering Star Formation: Shock waves
from supernovae can compress interstellar clouds,
leading to the birth of new stars.
Influencing Planetary Systems: The chemical
enrichment from supernovae affects planetary
composition and the potential for life.
3.8 Observing Supernovae
Astronomers detect supernovae using telescopes
equipped with optical, X-ray, and radio instruments.
Observatories like the Hubble Space Telescope, the
Chandra X-ray Observatory, and ground-based facilities
such as the Keck Observatory monitor these explosive
events. Citizen science projects and automated surveys,
like the All-Sky Automated Survey for Supernovae
(ASAS-SN), contribute significantly to supernova
research.
3.9 Notable Supernovae in History
Several supernovae have been observed throughout
history, contributing to our understanding of these
cosmic events:
SN 1054: Observed by Chinese and Middle Eastern astronomers, its remnant is now known as the Crab Nebula.
SN 1572 (Tycho’s Supernova): Studied by Tycho
Brahe, this event provided evidence against the
Aristotelian belief in an unchanging celestial
sphere.
SN 1604 (Kepler’s Supernova): Documented
by Johannes Kepler, it was the last supernova
observed in the Milky Way.
SN 1987A: One of the closest and most studied
supernovae, providing insights into core-collapse
mechanisms.
3.10 Future Research and Supernova
Detection
With advancements in technology, astronomers
continue to discover and analyze supernovae. The Vera
C. Rubin Observatory is expected to detect thousands
of supernovae per year, improving our understanding
of stellar evolution, dark energy, and the structure
of the universe. Future space missions will further
enhance our ability to observe and interpret these stellar
explosions.
3.11 Conclusion
Supernovae are essential to our understanding of
stellar evolution, cosmic chemistry, and the universe’s
expansion. These explosions not only mark the end
of massive stars but also seed the universe with the
building blocks necessary for planets and life. Ongoing
studies continue to uncover new insights into the physics
of these stellar cataclysms, helping us better understand
the cosmos.
References:
Supernova
A Brief Review of Historical Supernovae
History of supernova observation
Supernova explosions and historical chronology
What is a supernova?
About the Author
Sindhu G is a research scholar in Physics doing research in Astronomy & Astrophysics. Her research mainly focuses on the classification of variable stars using different machine learning algorithms. She also works on period prediction for different types of variable stars, especially eclipsing binaries, and on the study of optical counterparts of X-ray binaries.
Part II
Biosciences
Synthetic Biology: A Revolutionary Scientific
Frontier - Part 2
Recent Advancements in Synthetic Biology
by Geetha Paul
airis4D, Vol.3, No.3, 2025
www.airis4d.com
1.1 Introduction
Synthetic biology is no longer a futuristic dream; it's a rapidly evolving reality. Recent advancements are transforming our ability to design and build biological systems with unprecedented precision. From
engineering microbes to produce life-saving drugs to
creating sustainable alternatives to fossil fuels, the
potential of synthetic biology seems limitless. In the
ever-evolving landscape of science and technology,
synthetic biology stands out as a revolutionary field
that merges biology, engineering, and computational
design to reshape the boundaries of what is possible.
By reprogramming the genetic code of organisms,
scientists can now create novel biological systems
or redesign existing ones to address some of the
most pressing challenges facing humanity. From
sustainable energy production and environmental
remediation to groundbreaking medical therapies and
bio-manufacturing, synthetic biology is paving the
way for a future where biology is as programmable
as software.
Recent advancements in this field have accelerated
its potential, driven by breakthroughs in gene editing
technologies like CRISPR, the development of artificial
intelligence-driven design tools, and the synthesis
of increasingly complex genetic circuits. These
innovations are not only expanding our understanding of life but are also enabling the creation of organisms with entirely new functions, opening doors
to applications that were once the realm of science
fiction.
This article explores the latest breakthroughs in
synthetic biology, highlighting how these advancements
are transforming industries, addressing global
challenges, and raising important ethical and societal
questions. As we stand on the brink of a new era
in biological engineering, the implications of these
discoveries are profound, promising to reshape our
world in ways we are only beginning to imagine.
1.2 A Brief Overview
1.2.1 CRISPR-Cas9 Enhancements
CRISPR-Cas9 is a groundbreaking tool in synthetic biology that allows scientists to edit genes with high precision and fewer unwanted side effects, and it has become more accurate and versatile. Base editing
and prime editing are new methods that allow scientists
to make tiny, specific changes to DNA without making
big cuts, resulting in more precise gene modifications.
CRISPR-Cas variants like Cas12 and Cas13 are also
being explored for RNA editing and diagnostics.
Figure 1: The figure illustrates how the Cas9 enzyme (blue) generates breaks in double-stranded DNA by
using its two catalytic centers (blades) to cleave each
strand of a DNA target site (gold) next to a PAM
sequence (red) and matching the 20-nucleotide sequence
(orange) of the single guide RNA (sgRNA). The sgRNA
includes a dual-RNA sequence derived from CRISPR
RNA (light green) and a separate transcript (tracrRNA,
dark green) that binds and stabilizes the Cas9 protein.
Cas9-sgRNA–mediated DNA cleavage produces a blunt
double-stranded break that triggers repair enzymes to
disrupt or replace DNA sequences at or near the cleavage
site. Catalytically inactive forms of Cas9 can also be
used for programmable regulation of transcription and
visualization of genomic loci.
Image Courtesy: www.ndsu.edu/pubweb/mcclean/ctig/ctigfall2016/doudna-and-charpentier-the-new-frontier-of-genome-engineering-with-CRISPR-Cas9.pdf
1.2.2 DNA Synthesis Innovations
Advances in DNA synthesis technologies,
such as enzymatic DNA synthesis and chip-based
oligonucleotide synthesis, have significantly reduced
costs and increased the speed of gene assembly.
Companies like Twist Bioscience and DNA Script are
pioneering these methods, enabling the synthesis of
longer and more complex DNA sequences.
Figure 2: The synthetic biology test cycle. (From
the top, clockwise) Synthetic DNA constructs are
designed and manipulated using computer-aided design
software. The designed DNA is then divided into
synthesisable pieces (synthons) up to 1–1.5 kbp. The
synthons are then broken into overlapping single-
stranded oligonucleotide sequences and chemically
synthesised. The oligonucleotides are then assembled
together into the designed synthons using gene synthesis
techniques. Multiple synthons can be assembled into
larger DNA assemblies or devices if necessary. The
assembled DNAs are then typically cloned into an
expression vector and sequence-verified. Once verified,
the synthetic constructs are transformed into a cell
and the function of the synthetic construct is assayed.
Depending on the results, the constructs can then be
modified or refined, and the test cycle is repeated until
a DNA construct is obtained that produces the desired
function.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC5204324/
1.2.3 Synthetic Cells and Organs
Researchers are making strides in creating
synthetic cells and organs by combining synthetic
biology with tissue engineering. For example,
artificial cells capable of performing essential metabolic
functions and synthetic organoids for drug testing are
being developed. These advancements hold promise
for regenerative medicine and personalised healthcare.
Figure 3: The figure illustrates the modular approach
for building synthetic cells with cell-like properties.
The integration of functional modules creates synthetic
cells with increasing complexity.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC9314110/
1.2.4 Synthetic Microbes
Engineered microbes are being designed for
applications ranging from biofuel production to disease
treatment. Recent work includes the development
of synthetic bacteria that can detect and destroy
pathogens or produce therapeutic compounds in the
gut microbiome.
Figure 4: The figure shows future food challenges
which give rise to opportunities for synthetic biology.
(A) Synthetic biology tries to implement engineering
principles into life. The lightbulb highlights some of
the challenges for future foods. These challenges may
be inspirational for experimental designs for synthetic
biology methodology with the potential to improve
a process or overcome related problems. Microbes
can be altered through the “Design-Build-Test-Learn”
cycle for a greater aim and particular microbes from
traditional fermentation processes have the potential to
address future food challenges. (B) Workflow showing
a pipeline to domesticate microbes, for example from
traditional fermentations processes.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC9523148/
The initial source can be analyzed by traditional
isolation of individual microbes or by metagenomics
approaches to initially get an overview of the community
before individuals are isolated. The isolated microbes
need to be identified and characterized. Once the
organism is known, one can start to make the organism
accessible for synthetic biology approaches. Therefore,
initial genetic engineering methods need to be
established (i.e., transformation procedures), followed
by advances in engineering tools (i.e., CRISPR/Cas-
based methods) and the generation of modular tool-
boxes for quick and reliable engineering of the organism.
The established tools allow microbe domestication, for
example by removal or addition of genes, for easier
handling. Subsequently the domesticated microbe can
be used for intensive engineering towards a desired goal, for example the assimilation of a sustainable feedstock.
1.2.5 Machine Learning and Data Science
Integration
Machine learning is revolutionising synthetic biology by enabling the prediction of gene expression, protein folding, and metabolic pathways. Tools like AlphaFold
have accelerated protein design, while AI-driven
platforms optimise genetic circuits and metabolic
engineering.
Figure 5: Shows the Machine learning and data science
integration. a, The performance of AlphaFold on the
CASP14 dataset (n = 87 protein domains) relative to
the top-15 entries (out of 146 entries), group numbers
correspond to the numbers assigned to entrants by CASP.
Data are median and the 95% confidence interval of
the median, estimated from 10,000 bootstrap samples.
b, Prediction of CASP14 target T1049 (PDB 6Y4F,
blue) compared with the true (experimental) structure
(green). Four residues in the C terminus of the crystal
structure are B-factor outliers and are not depicted.
c, CASP14 target T1056 (PDB 6YJ1). An example
of a well-predicted zinc-binding site (AlphaFold has
accurate side chains even though it does not explicitly
predict the zinc ion). d, CASP target T1044 (PDB
6VR4)—a 2,180-residue single chain—was predicted
with correct domain packing (the prediction was made
after CASP using AlphaFold without intervention). e,
Model architecture. Arrows show the information flow
among the various components described in this paper.
Array shapes are shown in parentheses with s, number
of sequences (Nseq in the main text); r, number of
residues (Nres in the main text); c, number of channels.
Image Courtesy:
https://www.nature.com/articles/s41586-021-03819-2
The combination of datasets allowed AlphaFold
to learn the complex relationships between amino
acid sequences and protein structures, leading to
its groundbreaking performance in protein structure
prediction.
1.2.6 Development of Synthetic Vaccines
Synthetic vaccines are vaccines designed and
constructed using synthetic biology and chemical
synthesis techniques. Unlike traditional vaccines that use weakened or inactivated pathogens, synthetic vaccines are created from synthesised components, such as molecules that mimic parts of the pathogen, nucleic acids encoding specific antigens, assemblies of synthetic proteins, or synthetic oligosaccharides combined with carrier proteins.
Synthetic biology has been pivotal in vaccine
development, particularly during the COVID-19
pandemic. mRNA vaccines, such as those developed
by Moderna and Pfizer-BioNTech, are a prime example
of synthetic biology. Researchers are now exploring
synthetic vaccines for other diseases, including cancer
and HIV.
Figure 6: A brief illustration of current chemical and synthetic biology approaches for developing cancer vaccines (DC cell: dendritic cell).
Image Courtesy: https://pmc.ncbi.nlm.nih.gov/articles/PMC9611187/
1.2.7 Sustainable Materials Production
Synthetic biology enables the production of
sustainable materials, such as biodegradable plastics,
bio-based textiles, and lab-grown leather. Engineered
microorganisms convert renewable feedstocks into eco-
friendly materials, reducing reliance on fossil fuels.
1.2.8 Bioremediation Techniques
Engineered microbes and plants are being
developed to clean up environmental pollutants, such
as oil spills, heavy metals, and plastic waste. Recent
advancements include the creation of synthetic bacteria
that can degrade polyethylene terephthalate (PET)
plastics and detoxify contaminated soil.
Figure 7: A proof of concept for engineering
environmental microbiomes to rapidly degrade PET
plastics.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC11420662/
Figure 8: Conjugation of pFAST-PETase-cis into
wastewater bacteria. (A) Schematic map of pFAST-
PETase-cis (not to scale): oriT, RK2/RP4 conjugative
origin of transfer; FAST-PETase, gene for FAST-PETase
enzyme; AmpR, ampicillin resistance gene (bla-TEM
1 ); mCherry, gene for fluorescent protein mCherry;
conjugation genes, encoding the IncP RK2/RP4
conjugation system; GmR, gentamycin resistance gene;
pBBR1 oriV, plasmid origin of replication. (B) FAST-
PETase coding region (not to scale) highlighting the
arabinose-inducible promoter (PBAD), signal peptide
(SPstu) and 6× His-tag. (C) Experimental procedure for
conjugation of pFAST-PETase-cis into bacteria from
a wastewater sample. (D) Experimental procedure
for measuring conjugation efficiency in wastewater
suspension.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC11420662/
1.3 Conclusion
These advancements highlight the transformative
potential of synthetic biology across diverse fields,
offering innovative solutions to global challenges
while raising important ethical and regulatory
considerations. The integration of machine learning
and data science is further accelerating progress,
enabling the design of complex biological systems
with unprecedented accuracy and efficiency.
However, as synthetic biology continues to push
the boundaries of what is possible, it also raises
important ethical, social, and regulatory questions. The
ability to engineer life at the molecular level demands
careful consideration of biosafety, biosecurity, and the
equitable distribution of benefits. Collaborative efforts
among scientists, policymakers, and the public will be
essential to ensure that these powerful technologies are
used responsibly and for the greater good.
As we look to the future, synthetic biology holds
the promise of transforming our world in ways that were
once unimaginable. By harnessing the power of biology,
we are not only gaining a deeper understanding of life but
also unlocking innovative solutions to global challenges,
paving the way for a healthier, more sustainable, and
resilient future. The journey ahead is as exciting as
it is complex, and the potential for positive impact is
immense.
References
Doudna, J. A., & Charpentier, E. (2014). Genome
editing. The new frontier of genome engineering with
CRISPR-Cas9. Science (New York, N.Y.), 346(6213),
1258096. https://doi.org/10.1126/science.1258096
Hughes, R. A., & Ellington, A. D. (2017).
Synthetic DNA Synthesis and Assembly: Putting
the Synthetic in Synthetic Biology. Cold Spring
Harbor perspectives in biology, 9(1), a023812.
https://doi.org/10.1101/cshperspect.a023812
Guindani, C., da Silva, L. C., Cao, S.,
Ivanov, T., & Landfester, K. (2022). Synthetic
Cells: From Simple Bio-Inspired Modules to
Sophisticated Integrated Systems. Angewandte Chemie
(International ed. in English), 61(16), e202110855.
https://doi.org/10.1002/anie.202110855
Hwang, I. Y., Koh, E., Wong, A., March, J. C.,
Bentley, W. E., Lee, Y. S., & Chang, M. W. (2017).
Engineered probiotic Escherichia coli can eliminate
and prevent Pseudomonas aeruginosa gut infection in
animal models. Nature communications, 8, 15028.
https://doi.org/10.1038/ncomms15028
Jumper, J., Evans, R., Pritzel, A. et
al. Highly accurate protein structure prediction
with AlphaFold. Nature 596, 583–589 (2021).
https://doi.org/10.1038/s41586-021-03819-2
Pardi, N., Hogan, M., Porter, F. et al.
mRNA vaccines a new era in vaccinology.
Nat Rev Drug Discov 17, 261–279 (2018).
https://doi.org/10.1038/nrd.2017.243
Keasling J. D. (2010). Manufacturing
molecules through metabolic engineering. Science
(New York, N.Y.), 330(6009), 1355–1358.
https://doi.org/10.1126/science.1193990
Yoshida, S., Hiraga, K., Takehana, T., Taniguchi,
I., Yamaji, H., Maeda, Y., Toyohara, K., Miyamoto,
K., Kimura, Y., & Oda, K. (2016). A bacterium that
degrades and assimilates poly(ethylene terephthalate).
Science (New York, N.Y.), 351(6278), 1196–1199.
https://doi.org/10.1126/science.aad6359
https://pmc.ncbi.nlm.nih.gov/articles/PMC5204324/
https://pmc.ncbi.nlm.nih.gov/articles/PMC9314110/
https://www.nature.com/articles/s41586-021-03819-2
https://pmc.ncbi.nlm.nih.gov/articles/PMC9611187/
https://pmc.ncbi.nlm.nih.gov/articles/PMC11420662/
About the Author
Geetha Paul is one of the directors of
airis4D. She leads the Biosciences Division. Her
research interests extend from Cell & Molecular
Biology to Environmental Sciences, Odonatology, and
Aquatic Biology.
Part III
Computer Programming
Principal Component Analysis
by Linn Abraham
airis4D, Vol.3, No.3, 2025
www.airis4d.com
1.1 Introduction
Principal Component Analysis (PCA) is a useful tool in machine learning for feature extraction, dimensionality reduction and data visualization. The key idea is to identify a smaller number of features or dimensions that still describe your actual data. Apart from reducing the size of the data that the machine learning technique needs to deal with, this can also improve your results by removing possible noise in the data. Remember that PCA is a linear technique, since the PCA axes are simply a rotated version of your original axes. The idea of the principal component is to find the direction in your original feature space along which the variation in the data is maximum. Once this is done, the second principal component is the direction orthogonal to the first that accounts for the maximum variation in the remaining data. This is repeated until all of the data variation is accounted for. Let us try to understand this technique by going through the math.
1.2 Covariance
Variance of a set of numbers is a measure of how spread out the numbers are. It is computed as the expected value of the squared distance between each value and the mean (the expected value):
$\mathrm{var}(\{x_i\}) = E\big[(\{x_i\} - \mu)^2\big]$
If we have a set of random variables, then for each variable we end up with a set of numbers corresponding to how many times the variable was measured. Computing the variance of each random variable is then straightforward using this definition. When it comes to covariance, however, we extend this idea to two random variables simultaneously. Here we define the covariance of two random variables using a similar distance measure, but instead of a single squared term we have a cross term containing the two variables. The covariance of two variables, so defined, measures how dependent the two variables are (in the statistical sense).
When we have a data matrix of order n × m, corresponding to n samples each with m features, taking all pairwise combinations of variables yields a square symmetric matrix of order m × m, which is called the covariance matrix. The variances then appear as the diagonal terms of this covariance matrix.
$\mathrm{cov}(\{x_i\}, \{y_i\}) = E\big[(\{x_i\} - \mu)(\{y_i\} - \nu)\big]$

where µ is the mean of the set {x_i}, ν is the mean of the set {y_i}, and E is the expectation operator.
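To make the definition concrete, here is a small NumPy check (the data and variable names are illustrative, not from the article): the covariance matrix of an n × m data matrix is m × m, and its diagonal holds the individual variances. Note that np.cov uses the unbiased 1/(n−1) normalisation.

```python
import numpy as np

# Illustrative data matrix: n = 100 samples, m = 3 features (synthetic values).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))

# np.cov with rowvar=False treats columns as variables, giving an m x m matrix.
C = np.cov(X, rowvar=False)
print(C.shape)                                          # (3, 3)

# The diagonal of the covariance matrix holds the per-feature variances.
print(np.allclose(np.diag(C), X.var(axis=0, ddof=1)))   # True
```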
1.3 Eigenvectors
Let us refresh our memory of what eigenvectors and eigenvalues are. For a square symmetric matrix A, if there exists a non-zero vector V which, when operated on by the matrix, results in a scaled version of the same vector, i.e. A V = λV, then such a vector is called an eigenvector and the scale factor λ is known as the corresponding eigenvalue.
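As a quick illustration of this definition (the matrix below is an arbitrary example, not one from the article), NumPy's symmetric eigen-solver returns vectors that the matrix merely rescales:

```python
import numpy as np

# A small square symmetric matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is NumPy's eigen-solver for symmetric matrices.
eigvals, eigvecs = np.linalg.eigh(A)
v = eigvecs[:, 0]

# Operating on an eigenvector only rescales it: A v = lambda v.
print(np.allclose(A @ v, eigvals[0] * v))   # True
```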
1.4 Principal Components
The objective of doing PCA is to find a new set
of basis vectors for representing the data such that in
this new space, the covariance matrix corresponding to
the transformed data matrix becomes a diagonal matrix.
From the spectral theorem we have the following result for a square symmetric matrix A:

$E^T A\, E = D \quad (1.1)$

where D is a diagonal matrix made out of the eigenvalues and E is an orthogonal matrix whose columns contain the normalized eigenvectors of A.
Now, for any transformation matrix P which acts on the original data matrix X, we have the following result:

$\mathrm{cov}(P^T X) = P^T\, \mathrm{cov}(X)\, P \quad (1.2)$

Since we require P to be such that cov(P^T X) is a diagonal matrix, it is clear by comparing the two previous equations that P should be the matrix made out of the eigenvectors of the covariance matrix of X. Since the left-hand side of equation 1.1 and the right-hand side of equation 1.2 both represent a change of basis operation, the interpretation of P^T X is that we are going from the original space to another space defined by the eigenvectors of the covariance matrix (in the original space).
Since the eigenvalues corresponding to the eigenvectors are the diagonal elements of the new covariance matrix, they also represent the data variances along the eigenvector directions. The eigenvector corresponding to the largest eigenvalue becomes your first principal component, and so on until you get to the eigenvector with the smallest eigenvalue, which becomes your last principal component. If a combination of k eigenvectors is enough to explain the bulk of the variance in your data set, you can choose to drop the rest. An approximation of the original data can then be reconstructed by projecting the retained k components back onto the original axes.
1.5 The PCA algorithm
It is now time to summarize the PCA algorithm, which is quite simple to implement using NumPy; a minimal code sketch follows the steps below.
• Write the N data points x_i = (x_{1i}, x_{2i}, ..., x_{Mi}) as row vectors.
• Put these vectors into a matrix X (which will have size N × M).
• Centre the data by subtracting off the mean of each column, putting the result into a matrix B.
• Compute the covariance matrix C = (1/N) B^T B.
• Compute the eigenvalues and eigenvectors of C, so that V^{-1} C V = D, where V holds the eigenvectors of C and D is the M × M diagonal matrix of eigenvalues.
• Sort the columns of D into order of decreasing eigenvalues, and apply the same ordering to the columns of V.
• Reject those directions with eigenvalue less than some threshold η, leaving K dimensions in the data.
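Following these steps, here is a minimal NumPy sketch (the function name pca and the toy data are illustrative; it keeps a fixed number of components rather than thresholding eigenvalues by η, and np.cov normalises by N−1 instead of N, which only rescales the eigenvalues):

```python
import numpy as np

def pca(X, n_components):
    # Centre the data by subtracting the mean of each column.
    B = X - X.mean(axis=0)
    # Covariance matrix of the centred data.
    C = np.cov(B, rowvar=False)
    # Eigen-decomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(C)
    # Sort eigenvalues and eigenvectors in decreasing order.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep the leading n_components directions and project the data onto them.
    V = eigvecs[:, :n_components]
    return B @ V, eigvals

# Toy example: 200 samples with 5 features, two of which are correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * rng.normal(size=200)

scores, variances = pca(X, n_components=2)
print(scores.shape)    # (200, 2)
print(variances)       # variances along the principal directions, largest first
```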
References
[Chatfield and Collins(1980)] Christopher Chatfield and Alexander J. Collins. Introduction to Multivariate Analysis. Springer US, Boston, MA, 1980. ISBN 978-0-412-16030-1, 978-1-4899-3184-9. doi: 10.1007/978-1-4899-3184-9.
[Marsland(2014)] Stephen Marsland. Machine Learning: An Algorithmic Perspective. Chapman and Hall/CRC, 2nd edition, October 2014. ISBN 978-0-429-10250-9. doi: 10.1201/b17476.
[Halmos(1974)] P. R. Halmos. Finite-Dimensional Vector Spaces. Undergraduate Texts in Mathematics. Springer-Verlag, 1974. ISBN 0-387-90093-4, 978-0-387-90093-3.
About the Author
Linn Abraham is a researcher in Physics,
specializing in A.I. applications to astronomy. He is
currently involved in the development of CNN based
Computer Vision tools for prediction of solar flares
from images of the Sun, morphological classifications
of galaxies from optical images surveys and radio
galaxy source extraction from radio observations.
An Introduction to Parallel Computing
by Ajay Vibhute
airis4D, Vol.3, No.3, 2025
www.airis4d.com
2.1 Introduction
The history of computing is an interesting journey
that started in the 1830s with Charles Babbage’s innovative
design of the Analytical Engine, the first conceptual
mechanical computer. Over the next century, computing
technology advanced rapidly, leading to the birth of
digital computers in the late 1930s. Among these,
the Atanasoff-Berry Computer (ABC) was one of the
first to have an electronic Arithmetic and Logic Unit.
However, it was the ENIAC (Electronic Numerical
Integrator and Computer), completed in 1945, that truly
revolutionized the field, becoming the first general-
purpose, programmable digital computer. The 1950s
marked a major milestone with the advent of the
UNIVAC (Universal Automatic Computer), the first
commercially successful computer, followed by the
TRADIC (Transistor Digital Computer) in 1955, which
introduced transistors into the world of computing.
These innovations laid the foundation for the personal
computing revolution of the 1970s and 1980s.
In the early era of personal computing, single-
core processors such as the Intel 4004 and 8080
were primarily used. These processors could handle
only one instruction at a time, limiting their tasks
to basic, sequential operations. At that time,
parallel computing on personal computers was nearly
impossible due to the lack of adequate hardware,
software, and operating system support. However, the
development of multi-core processors, multi-processor
systems, and improvements in software multitasking
have dramatically changed the computing landscape.
Today, parallel computing is not just feasible but
crucial, enabling modern systems to perform multiple
tasks simultaneously and achieve remarkable gains in
efficiency and performance.
This article examines the programming models
that allow us to fully harness the capabilities of modern
hardware. Along the way, we’ll uncover the tools
and techniques that continue to drive innovation in the
ever-evolving field of computing.
2.2 Memory Architectures
Memory architecture is one of the key components
of computer design, defining how data is stored
and accessed. Memory architectures play a vital
role in parallel computing, and understanding these
architectures will help optimize system performance
and underlying resource utilization. The memory
architectures are broadly classified into three categories:
shared memory, distributed memory.
2.2.1 Shared Memory Architecture
Shared memory architecture is a commonly used
memory model in parallel computing, where multiple
processes access the same pool of memory. In such
cases, all processes are tied to a single memory unit,
allowing them to read and write to the same memory
pool. Shared access enables data exchange between
multiple processes in a simple form and eliminates the
need for data transfer over a network. However, as all
processes can access the same memory locations, this
leads to a need for efficient synchronization to prevent
data conflicts. Synchronization can be achieved through
the use of locks, semaphores, and barriers before
writing to memory locations that are used concurrently.
Although the shared memory model is the simplest
approach to achieve parallelism, it has limitations in
scalability. As the number of processes increases,
contention for memory access can create bottlenecks
and slow down the system.
The shared memory architecture can further be
divided into Uniform Memory Access (UMA) and
Non-Uniform Memory Access (NUMA). In UMA, all processors share the same memory pool, and each processor has equal access time to every memory location (Figure 1). While UMA is the simplest model for facilitating synchronization, it also has limited scalability.
Figure 1: Uniform Memory Access Architecture
In contrast, in NUMA, each processor is attached to its own local memory, and processors can also access the memory of other processors (Figure 2). Local memory can be accessed faster than remote memory, which reduces memory access contention and helps the system scale. However, this scalability comes at the cost of increased complexity in maintaining memory consistency.
Figure 2: Non-Uniform Memory Access Architecture
Shared memory parallelism can be achieved using POSIX Threads, which provide a set of APIs to create and manage threads. Alternatively, one can use OpenMP, which provides compiler directives and library routines focused on shared-memory parallelism.
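The article's named tools for shared-memory programming are POSIX Threads and OpenMP; purely as an analogue of the same synchronization idea, here is a minimal Python threading sketch in which a lock protects a shared counter (the variable names are illustrative):

```python
import threading

counter = 0                      # shared state living in one memory pool
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # The lock serialises writes to the shared variable, preventing
        # lost updates when several threads modify it concurrently.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # 400000 with the lock; without it, updates can be lost
```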
2.2.2 Distributed Memory Architecture
In distributed memory architecture, each processor in a system owns its dedicated local memory, which it can access without involving other processors (Figure 3). Unlike shared memory systems, where all processors access a shared memory pool, distributed memory systems rely on explicit communication between processes to exchange data over the network. In this setup, each process functions independently using its private memory, which leads to faster memory access and better scalability, since the system can grow simply by adding new processors and their associated memory units without altering the existing memory structure.
Figure 3: Distributed Memory Architecture
Distributed memory systems can further be classified into compute clusters, grid computing, and cloud computing. A compute cluster consists of several independent nodes, each with its own memory. These nodes are connected by a high-bandwidth, low-latency network, such as InfiniBand, and communicate with each other using the Message Passing Interface (MPI) to exchange data. Grid computing systems primarily
contain geographically distributed resources, mostly
using private networks for communication. Each
resource in the grid is an independent system. In
a cloud computing environment, distributed memory
architecture is used to provide on-demand resources.
Whenever there is a resource demand, compute nodes
are added to the resource pool and remain available
until the demand is fulfilled.
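Since distributed-memory processes exchange data only by explicit message passing, a minimal MPI sketch may help. This assumes the mpi4py package and an MPI runtime are installed (neither is prescribed by the article) and would be launched with something like `mpiexec -n 2 python demo.py`:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = {"payload": [1, 2, 3]}
    # No memory is shared: rank 0 must explicitly send the data to rank 1.
    comm.send(data, dest=1, tag=11)
elif rank == 1:
    data = comm.recv(source=0, tag=11)
    print(f"Rank 1 received {data} from rank 0")
```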
2.3 Summary
In summary, both shared memory and distributed
memory architectures are important in the parallel
computing domain, each with unique strengths and
challenges. Shared memory architecture simplifies
communication between processes by providing a single,
unified memory pool accessible to all processors. It
also simplifies synchronization and allows for faster
data exchange within a single system. However,
it has limitations in scalability as the number of
processors grows, leading to memory contention.
Robust synchronization mechanisms are essential to
avoid memory conflicts and ensure that each shared memory location is written by only one process at a time. On the
other hand, distributed memory architecture improves
scalability and fault tolerance by giving each processor
its own memory. This design allows systems to scale
more effectively and manage large-scale tasks. However,
it relies on explicit communication between processors,
which increases programming complexity and can
introduce communication overhead. Efficient message-
passing protocols, such as MPI, are critical for enabling
seamless data exchange in distributed environments.
The choice between shared memory and distributed
memory ultimately depends on the needs of the
application, the required scalability, performance, and
available resources. For many high-performance and
large-scale systems, a hybrid approach that combines
both architectures is often implemented to leverage
the strengths of each architecture, allowing optimal
resource utilization and system efficiency.
About the Author
Dr. Ajay Vibhute is currently working
at the National Radio Astronomy Observatory in
the USA. His research interests mainly involve
astronomical imaging techniques, transient detection,
machine learning, and computing using heterogeneous,
accelerated computer architectures.
Understanding Cosine Similarity: A
Mathematical Perspective
by Jinsu Ann Mathew
airis4D, Vol.3, No.3, 2025
www.airis4d.com
Mathematical concepts have a remarkable way
of transcending disciplines, seamlessly connecting
seemingly unrelated fields. One such concept is
cosine similarity—a tool that has proven invaluable
in modern data science, yet its foundations trace back
to fundamental principles that have been explored
for centuries. Whether analyzing language patterns,
building recommendation systems, or detecting
anomalies, cosine similarity offers an elegant approach
to measuring relationships in high-dimensional spaces.
At its core, cosine similarity is a measure of
how two entities align, disregarding differences in
scale and focusing purely on their direction. This
makes it particularly powerful in applications
where relationships matter more than raw
magnitudes—whether comparing documents,
identifying similar user preferences, or clustering
complex datasets.
3.1 Mathematical Formula for Cosine Similarity
Cosine similarity is a fundamental concept in
data science and machine learning, used to measure the
similarity between two vectors based on the cosine of the
angle between them. Unlike traditional distance-based
metrics, cosine similarity focuses on the orientation of
vectors rather than their magnitude. It ranges from -1
to 1, where 1 indicates that the vectors are identical
in direction, 0 means they are completely unrelated
(orthogonal), and -1 signifies that the vectors point in
opposite directions. This characteristic makes cosine
similarity particularly useful in applications where the
relative direction of data points matters more than their
absolute values.
Mathematically, cosine similarity is calculated
using the dot product of two vectors divided by the
product of their magnitudes. The formula is expressed
as:
$\cos(\theta) = \frac{A \cdot B}{\|A\|\,\|B\|} \quad (3.1)$

In this equation, A · B represents the dot product of the vectors, which sums the element-wise multiplication of their components. The terms ‖A‖ and ‖B‖ denote the Euclidean magnitudes (or norms) of the vectors, ensuring that the similarity measure remains independent of vector scale. By normalizing the dot product with the magnitudes, cosine similarity captures the degree of alignment between two vectors while disregarding their absolute lengths.
This property is particularly significant in high-
dimensional vector spaces, where traditional distance
measures like Euclidean distance may become less
meaningful. In fields such as natural language
processing (NLP) and information retrieval, text
documents and words are often represented as vectors
in a multi-dimensional space.
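In code, the formula translates directly into a few NumPy operations (the function name cosine_similarity and the sample vectors are illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])     # same direction, different magnitude
print(cosine_similarity(a, b))    # 1.0 (up to floating point error)
```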
3.2 Geometric Interpretation
The geometric interpretation of cosine similarity
further highlights its significance. By measuring
the angle between vectors rather than their absolute
difference, it provides a more intuitive understanding
of relationships in vector space. When two vectors
are closely aligned, the cosine value approaches 1,
indicating high similarity. As the angle increases, the
cosine value moves toward 0, signifying lower similarity.
If the vectors point in opposite directions, the similarity
becomes -1, meaning complete dissimilarity.
This concept is deeply connected to how vectors
are used in physics. Imagine two forces acting on an
object—when they are applied in the same direction,
their effects reinforce each other, just as a cosine
similarity of 1 indicates identical vectors. If they are
perpendicular, they have no direct influence on each
other, similar to a cosine similarity of 0. When forces
act in opposite directions, they cancel each other out,
resembling a cosine similarity of -1. Just as physicists
use angles between vectors to determine resultant forces
or interactions, data scientists use cosine similarity to
quantify relationships in high-dimensional spaces.
3.3 Cosine Similarity in Data Science
Cosine similarity is widely used in data science
for measuring the similarity between data points
represented as vectors. This metric is particularly
valuable in high-dimensional spaces, where traditional
distance-based measures like Euclidean distance may
not be effective. Cosine similarity focuses solely on
the direction of the vectors rather than their magnitude,
making it ideal for applications such as text analysis,
recommendation systems, and clustering algorithms.
One of the most common use cases of cosine
similarity is in natural language processing (NLP),
where documents or sentences are converted into vector
representations using techniques like TF-IDF (Term
Frequency-Inverse Document Frequency) or word
embeddings (e.g., Word2Vec, BERT). The similarity
between two documents can then be assessed by
computing the cosine of the angle between their
respective vectors. To better understand how cosine similarity functions in document comparison, consider three key cases (Figure 1):
Figure 1: Vector Similarity
(Image courtesy: https://ai.plainenglish.io/understanding-ai-similarity-search-8548912203a6)
1) Similar Documents (Cosine Similarity = 1)
When two document vectors are nearly aligned, the
cosine similarity is close to 1, indicating high similarity.
This means the documents share many common words
or have similar thematic content. In applications
like plagiarism detection, a high cosine similarity
suggests that two documents are highly alike, potentially
containing overlapping information. Similarly, in search
engines, documents with high similarity to a query are
ranked higher in search results.
2) Unrelated (Orthogonal) Documents (Cosine
Similarity = 0)
When two vectors are perpendicular, their cosine
similarity is 0, indicating no meaningful relationship
between them. In text analysis, this suggests that the
documents have little to no common terms or concepts.
For example, an article about ”machine learning” and
another about ”classical music” might have a near-zero
cosine similarity, as they contain distinct vocabularies
with minimal overlap. In recommendation systems,
two users with completely different preferences would
have orthogonal vectors, meaning their behaviors do
not influence each other's recommendations.
3) Opposite Documents (Cosine Similarity = -1)
When two document vectors point in exactly
opposite directions (180-degree angle), their cosine
similarity is -1, indicating complete dissimilarity or
even contradiction. This scenario is uncommon in
general text analysis but can be useful in sentiment
classification. For instance, a review with extremely
positive language (e.g., ”excellent, outstanding, highly
recommend”) would have a cosine similarity close to
-1 when compared to a review with extremely negative
language (e.g., ”terrible, worst, do not buy”). This
distinction helps in identifying conflicting opinions and
polarizing viewpoints in data.
Cosine similarity also plays a crucial role in
recommendation systems. In platforms like Netflix or
Amazon, users' preferences are represented as vectors,
where each dimension corresponds to a product or
movie rating. By calculating cosine similarity between
users' rating vectors, the system can suggest items
based on the preferences of similar users. If two users
have a high similarity score, recommendations can be
made by suggesting items liked by one user to the other.
This method enhances personalized recommendations
without needing direct overlap in purchases or views.
In computer vision, cosine similarity is used in
image recognition and classification. Deep learning
models, such as FaceNet, encode images into feature
vectors in high-dimensional space. When identifying
a face, the model compares the vector representation
of the new image with stored representations. A high
cosine similarity score indicates that the two images
likely belong to the same person. This technique is
widely used in security applications, such as facial
recognition systems for unlocking smartphones or
verifying identity at airports.
Another major application of cosine similarity
is in clustering and anomaly detection. In clustering
algorithms like K-means, cosine similarity helps group
data points that are directionally similar. This is
particularly useful in customer segmentation, where
businesses categorize customers based on shopping
patterns. In fraud detection, cosine similarity helps
detect unusual transactions. Regular transactions
form clusters with high similarity, while fraudulent
transactions appear as outliers with low similarity
scores.
3.4 Understanding Cosine Similarity
Through a Practical Example
To grasp the concept of cosine similarity in a real-
world setting, let's consider an example in text analysis.
Imagine we have three short documents:
Document A: ”Machine learning is a branch of
artificial intelligence.”
Document B: ”Deep learning is a subfield of
machine learning.”
Document C: ”Classical music compositions
follow structured patterns.”
We want to determine how similar these documents
are to each other using cosine similarity.
Converting Text into Vectors
Since computers do not process raw text, we first
convert each document into a vector representation.
One common method is the Bag of Words (BoW) or TF-
IDF (Term Frequency-Inverse Document Frequency)
model, where words are treated as features in a vector
space.
For simplicity, let's assume we build a feature
space using the key terms: [”machine”, ”learning”,
”artificial”, ”intelligence”, ”deep”, ”subfield”, ”music”,
”compositions”, ”structured”, ”patterns”].
Now, each document can be represented as a vector
based on the presence of these words:
Vector A: [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
Vector B: [1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
Vector C: [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
Calculating Cosine Similarity
By computing cosine similarity:
Similarity(A, B) = 0.5. Documents A and B are clearly related, as they both discuss machine-learning topics. The terms they share ("machine" and "learning") give them a cosine similarity of 0.5, indicating a moderately strong relationship.
Similarity(A, C) = 0. Documents A and C are
completely different, as one discusses machine learning
and the other discusses music. Their cosine similarity
is 0, meaning they are orthogonal.
Similarity(B, C) = 0. Documents B and C are also
unrelated, reinforcing that cosine similarity effectively
captures semantic relationships.
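These numbers can be checked with a few lines of NumPy using the vectors defined above (the helper function is the same illustrative one as before):

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

A = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
B = np.array([1, 1, 0, 0, 1, 1, 0, 0, 0, 0])
C = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])

print(cosine_similarity(A, B))   # 0.5  (two shared terms out of four in each)
print(cosine_similarity(A, C))   # 0.0  (no shared terms)
print(cosine_similarity(B, C))   # 0.0  (no shared terms)
```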
3.5 Conclusion
Cosine similarity is a powerful mathematical tool
for measuring the similarity between vectors, making it
widely applicable across various fields. By leveraging
the cosine of the angle between vectors rather than their
magnitudes, it provides a robust measure of similarity
that remains unaffected by differences in scale.
From a mathematical perspective, cosine similarity
is based on the dot product and the magnitude of
vectors, ensuring that only the directional alignment of
data points is considered. Its geometric interpretation
further enhances its utility, as it enables an intuitive
understanding of relationships in vector space—where
similarity corresponds to closely aligned vectors,
dissimilarity to orthogonal vectors, and opposition to
vectors pointing in opposite directions.
In data science, cosine similarity plays a
crucial role in tasks such as document retrieval,
recommendation systems, and text clustering, helping to
identify meaningful relationships in high-dimensional
data. A practical example demonstrates how the
measure can be applied to compare documents, showing
how shared terms influence similarity scores.
Ultimately, cosine similarity serves as a
fundamental metric in machine learning, natural
language processing, and numerous analytical
applications. Its ability to quantify similarity efficiently
makes it a vital tool for handling vast amounts of data,
reinforcing its significance in computational and real-
world contexts.
References
Understanding AI Similarity Search
From physics to data science: the beauty and
power of cosine similarity
The Role of Cosine Similarity in Vector Space
What is Cosine Similarity: A Comprehensive
Guide
Unveiling the Power of Cosine Similarity in Text
Analysis
About the Author
Jinsu Ann Mathew is a research scholar
in Natural Language Processing and Chemical
Informatics. Her interests include applying basic
scientific research on computational linguistics,
practical applications of human language technology,
and interdisciplinary work in computational physics.
About airis4D
Artificial Intelligence Research and Intelligent Systems (airis4D) is an AI and Bio-sciences Research Centre.
The Centre aims to create new knowledge in the field of Space Science, Astronomy, Robotics, Agri Science,
Industry, and Biodiversity to bring Progress and Plenitude to the People and the Planet.
Vision
Humanity is in the 4th Industrial Revolution era, which operates on a cyber-physical production system. Cutting-
edge research and development in science and technology to create new knowledge and skills become the key to
the new world economy. Most of the resources for this goal can be harnessed by integrating biological systems
with intelligent computing systems offered by AI. The future survival of humans, animals, and the ecosystem
depends on how efficiently the realities and resources are responsibly used for abundance and wellness. Artificial
intelligence Research and Intelligent Systems pursue this vision and look for the best actions that ensure an
abundant environment and ecosystem for the planet and the people.
Mission Statement
The 4D in airis4D represents the mission to Dream, Design, Develop, and Deploy Knowledge with the fire of
commitment and dedication towards humanity and the ecosystem.
Dream
To promote the unlimited human potential to dream the impossible.
Design
To nurture the human capacity to articulate a dream and logically realise it.
Develop
To assist the talents to materialise a design into a product, a service, a knowledge that benefits the community
and the planet.
Deploy
To realise and educate humanity that a knowledge that is not deployed makes no difference by its absence.
Campus
Situated in a lush green village campus in Thelliyoor, Kerala, India, airis4D was established under the auspices of the SEED Foundation (Susthiratha, Environment, Education Development Foundation), a not-for-profit company for promoting Education, Research, Engineering, Biology, Development, etc.
The whole campus is powered by Solar power and has a rain harvesting facility to provide sufficient water supply
for up to three months of drought. The computing facility in the campus is accessible from anywhere through a
dedicated optical fibre internet connectivity 24×7.
There is a freshwater stream that originates from the nearby hills and flows through the middle of the campus.
The campus is a noted habitat for the biodiversity of tropical fauna and flora. airis4D carries out periodic and systematic water quality and species diversity surveys in the region to ensure its richness. It is our pride that the site has consistently been environment-friendly and rich in biodiversity. airis4D also grows fruit plants that can feed birds and maintains water bodies to help them survive the drought.