Cover page
Image Name: Way of Science: Hubble Space Telescope observations of a pair of very distant exploding stars, called Type Ia supernovae (red spot at the centre, near the galaxy), provide new clues about the accelerating universe and its mysterious "dark energy." Astronomers used the telescope's Advanced Camera for Surveys to help pinpoint the supernovae, which are approximately 5 billion and 8 billion light-years from Earth. The farther one exploded so long ago that the universe may still have been decelerating under its own gravity. For full story:
https://hubblesite.org/contents/news-releases/2003/news-2003-12.html
Managing Editor: Ninan Sajeeth Philip
Chief Editor: Abraham Mulamoottil
Editorial Board: K Babu Joseph, Ajit K Kembhavi, Geetha Paul, Arun Kumar Aniyan, Sindhu G
Correspondence: The Chief Editor, airis4D, Thelliyoor - 689544, India
Journal Publisher Details
Publisher : airis4D, Thelliyoor 689544, India
Website : www.airis4d.com
Email : nsp@airis4d.com
Phone : +919497552476
Editorial
by Fr Dr Abraham Mulamoottil
airis4D, Vol.3, No.3, 2025
www.airis4d.com
The cover page of this edition is an image from the Hubble Space Telescope that captures observations of two distant Type Ia supernovae (red spot near the galaxy centre), located approximately 5 billion and 8 billion light-years from Earth. These exploding stars provide critical insights into the universe's accelerating expansion and the mysterious force known as "dark energy." The farther supernova's explosion occurred when the universe might still have been decelerating due to gravity.
In "Black Hole Stories-16," Ajit Kembhavi
explores the physics of gravitational wave generation
and their cosmic sources. Gravitational waves
are produced by accelerated or decelerated matter
and energy, described by the stress-energy tensor,
analogous to electromagnetic waves from accelerated
charges. However, due to gravity’s weak interaction,
only massive cosmic events can generate detectable
waves. Key sources include: (1) burst sources (e.g.,
supernovae), (2) continuous sources (e.g., asymmetric
spinning objects), (3) binary systems (e.g., neutron
stars, black holes), and (4) stochastic backgrounds.
Binary systems, particularly those with compact objects
like black holes or neutron stars, lose energy through
gravitational waves, causing orbital decay and eventual
merger. These mergers, detected by observatories like
LIGO, provide critical insights into gravitational wave
physics. The article sets the stage for future discussions
on binary pulsars and gravitational wave discoveries.
Aromal P's article "X-ray Astronomy: Through Missions" underscores the transformative influence of
X-ray astronomy missions in the 1990s. Significant
satellites such as ARGOS, the Chandra X-ray
Observatory, and XMM-Newton revolutionised our
understanding of high-energy phenomena. ARGOS (1999): Featured the Unconventional Stellar Aspect (USA) experiment, offering unprecedented X-ray timing capabilities for studying Low Mass X-ray Binaries. Chandra
(1999): Launched with advanced optics and instruments
like HETG, LETG, HRC, and ACIS, Chandra provided
high-resolution imaging and spectroscopy. It made
groundbreaking discoveries, including evidence for dark
matter, X-ray pulsations from Jupiter, and resolved the
cosmic X-ray background into discrete sources. XMM-
Newton (1999): Equipped with EPIC, RGS, and OM
instruments, it enabled broadband analysis from visible
to X-ray ranges. Key achievements include measuring
supermassive black hole spin rates, studying tidal
disruption events, and mapping dark matter distribution.
These missions laid the foundation for modern X-ray
astronomy, inspiring future explorations and expanding
our knowledge of the universe.
The article "Supernova: The Explosive Death of Stars" by Sindhu G explores the dramatic and
transformative role of supernovae in the universe.
Supernovae are powerful stellar explosions that mark
the end of a star’s life, releasing immense energy
and dispersing heavy elements essential for cosmic
evolution. They are classified into two main types:
Type I: Lacks hydrogen in its spectrum and includes Type Ia (white dwarf explosions in binary systems) and Types Ib and Ic (massive stars that shed their outer layers). Type II: Retains hydrogen and results from the core collapse of massive stars, with subcategories like II-P, II-L, and IIn. Supernovae are driven by either thermonuclear explosions (Type Ia) or core collapse (Type II, Ib, Ic). Their aftermath includes the formation
of neutron stars, pulsars, black holes, and supernova
remnants that enrich the interstellar medium. These
explosions are vital for: Element formation: Producing
heavy elements like iron and gold. Cosmic distance
measurement: Type Ia supernovae serve as "standard candles." Star formation: Shock waves trigger the
birth of new stars. Planetary systems: Influencing the
chemical composition of planets. Notable historical
supernovae, such as SN 1054 (Crab Nebula) and SN
1987A, have provided critical insights into stellar
evolution. Future advancements, like the Vera C.
Rubin Observatory, promise to detect thousands of
supernovae annually, furthering our understanding of
the universe. Supernovae are not just destructive events
but are fundamental to the cosmic life cycle, shaping
the universe and enabling the existence of planets and
life.
The article "Synthetic Biology: A Revolutionary Scientific Frontier - Part 2" by Geetha Paul
explores recent advancements in synthetic biology,
a field that combines biology, engineering, and
computational design to reprogram organisms for
innovative applications. Key breakthroughs include:
CRISPR-Cas9 Enhancements: Improved precision
in gene editing with tools like base editing, prime
editing, and Cas12/Cas13 variants for RNA editing and
diagnostics. DNA Synthesis Innovations: Advances
in enzymatic and chip-based DNA synthesis have
reduced costs and enabled the creation of longer,
more complex genetic sequences. Synthetic Cells
and Organs: Progress in creating artificial cells
and organoids for regenerative medicine and drug
testing. Synthetic Microbes: Engineered microbes for
biofuel production, pathogen detection, and therapeutic
applications. Machine Learning Integration: AI
tools like AlphaFold optimize protein design, genetic
circuits, and metabolic pathways. Synthetic Vaccines:
Development of mRNA vaccines (e.g., COVID-19) and
exploration of synthetic vaccines for cancer and HIV.
Sustainable Materials: Production of biodegradable
plastics, bio-based textiles, and lab-grown leather
using engineered microorganisms. Bioremediation:
Engineered microbes and plants for cleaning pollutants
like plastic waste and heavy metals. These
advancements highlight synthetic biology’s potential to
address global challenges in healthcare, sustainability,
and environmental remediation. However, the
field raises ethical, social, and regulatory concerns,
necessitating responsible use and equitable distribution
of benefits. Synthetic biology promises to transform
industries and improve lives, paving the way for a
healthier, more sustainable future.
The article "Principal Component Analysis" by Linn Abraham explains Principal Component Analysis
(PCA), a fundamental technique in machine learning
used for feature extraction, dimensionality reduction,
and data visualization. PCA identifies a smaller set
of features (principal components) that capture the
maximum variance in the data, reducing noise and
improving machine learning results. Key steps in
PCA include: 1. Covariance Matrix: Computes how
variables in the dataset vary together. 2. Eigenvectors
and Eigenvalues: Eigenvectors represent directions
of maximum variance, and eigenvalues indicate the
magnitude of variance along these directions. 3.
Principal Components: The eigenvectors corresponding
to the largest eigenvalues are the principal components,
which form a new basis for the data. 4. Dimensionality
Reduction: By selecting a subset of principal
components that explain most of the variance, the data
can be represented in a lower-dimensional space.
The PCA algorithm involves: Centering the data
by subtracting the mean. Computing the covariance
matrix. Finding eigenvectors and eigenvalues. Sorting
and selecting components based on eigenvalues. and
Projecting the data onto the selected components. PCA
is widely used in various fields, including astronomy,
for tasks like galaxy classification and solar flare
prediction. The article provides a mathematical
foundation and practical steps for implementing PCA,
making it a powerful tool for simplifying and analyzing
complex datasets. The article "An Introduction to Parallel Computing" by Ajay Vibhute explores the evolution of computing from its early beginnings with Charles Babbage's Analytical Engine to modern parallel computing systems. Parallel computing, which enables
multiple tasks to be executed simultaneously, has
become essential for achieving high performance and
efficiency in today’s systems.
The article focuses on two key memory
architectures used in parallel computing: 1. Shared
Memory Architecture: All processors access a single
memory pool, simplifying data exchange but requiring
synchronization to avoid conflicts. Divided into
Uniform Memory Access (UMA), where all processors
have equal access time, and Non-Uniform Memory
Access (NUMA), where processors have faster access
to local memory. Limited scalability due to memory
contention as the number of processors increases.
2. Distributed Memory Architecture: Each processor
has its own dedicated memory, improving scalability
and fault tolerance. Requires explicit communication
between processors, often using protocols like Message
Passing Interface (MPI). Includes compute clusters,
grid computing, and cloud computing, which allow
systems to scale by adding more processors and memory
units. The article concludes that the choice between
shared and distributed memory architectures depends
on the application’s needs, scalability requirements, and
available resources. A hybrid approach combining both
architectures is often used to optimize performance and
resource utilization in high-performance systems.
The article "Understanding Cosine Similarity: A Mathematical Perspective" by Jinsu Ann Mathew
explores the mathematical foundations and practical
applications of cosine similarity, a key metric in data
science and machine learning. Cosine similarity
measures the alignment between two vectors by
calculating the cosine of the angle between them,
focusing on direction rather than magnitude. This
makes it particularly useful in high-dimensional
spaces, such as natural language processing (NLP),
recommendation systems, and clustering algorithms.
Mathematically, cosine similarity is derived from the
dot product of vectors normalized by their magnitudes,
ensuring scale invariance. It ranges from -1 (opposite
directions) to 1 (identical directions), with 0 indicating
orthogonality. The article highlights its geometric
interpretation, comparing it to vector interactions in
physics, and demonstrates its utility in text analysis,
where documents are represented as vectors. Practical
examples illustrate how cosine similarity can identify
similar or unrelated documents based on shared terms.
The author concludes that cosine similarity is a versatile
and powerful tool for quantifying relationships in
data, making it indispensable in fields like NLP,
recommendation systems, and anomaly detection.
Its ability to focus on directional alignment rather
than absolute values ensures its relevance in both
computational and real-world applications.
News Desk
Humanity at Crossroads?
Students from Christian College, Chengannur doing a project in the biosciences lab.
Dr Biju K.G from WMO College, Wayanad and Dr Padmakumar from MG College, Trivandrum on their visit to
airis4D.
Contents
Editorial ii
I Astronomy and Astrophysics 1
1 Black Hole Stories-16
The Generation of Gravitational Waves 2
1.1 Generation of Gravitational Radiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 X-ray Astronomy: Through Missions 5
2.1 Satellites in 1990s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 Supernova: The Explosive Death of Stars 9
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 What is a Supernova? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 Types of Supernovae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.4 Causes of Supernovae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.5 The Aftermath of a Supernova . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.6 The Role of Supernovae in Cosmic Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.7 Importance of Supernovae in Astronomy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.8 Observing Supernovae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.9 Notable Supernovae in History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.10 Future Research and Supernova Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.11 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
II Biosciences 12
1 Synthetic Biology: A Revolutionary Scientific Frontier - Part 2
Recent Advancements in Synthetic Biology 13
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 A Brief Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
III Computer Programming 19
1 Principal Component Analysis 20
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.2 Covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.3 Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.4 Principal Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.5 The PCA algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2 An Introduction to Parallel Computing 23
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 Memory Architectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3 Understanding Cosine Similarity: A Mathematical Perspective 26
3.1 Mathematical Formula for Cosine Similarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Geometric Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Cosine Similarity in Data Science . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.4 Understanding Cosine Similarity Through a Practical Example . . . . . . . . . . . . . . . . . . . 28
3.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Part I
Astronomy and Astrophysics
Black Hole Stories-16
The Generation of Gravitational Waves
by Ajit Kembhavi
airis4D, Vol.3, No.3, 2025
www.airis4d.com
In this story we will consider the physics of the
generation of gravitational waves, and the nature of the
cosmic sources which generate detectable gravitational
waves.
1.1 Generation of Gravitational
Radiation
In electromagnetism, electric charges and currents
are the source of electric and magnetic fields. A
distribution of static electric charges produces a static
electric field, while a steady electric current produces
a steady magnetic field. The situation becomes much
more interesting when electric charges are accelerated,
or there are changing electric currents, since these
produce electromagnetic waves which travel away from
the sources, carrying energy and information to distant
parts.
We have found in BHS-15 that for weak gravitational fields, Einstein's field equations of gravitation take the form of a wave equation, as in the case of electromagnetism. The sources of the gravitational waves here are matter and energy, which are described by a quantity known as the stress-energy tensor. Matter at rest or in uniform motion in a given
tensor. Matter at rest or in uniform motion in a given
frame of reference does not produce gravitational
waves, just as electric charges at rest or moving
uniformly do not produce electromagnetic waves. It
is matter in accelerated (or decelerated) motion and
energy which act as sources of gravitational waves.
While every object, however small its mass, would radiate gravitational waves when accelerated, the very weak nature of the gravitational interaction means that such waves produced by terrestrial objects would be
impossible to detect. This is very different from the case
with electromagnetism, where the interaction between
electromagnetic waves and charged particles is so strong
that these waves can be easily produced and detected.
So a simple cell phone can communicate with an Earth
orbiting satellite, and we can detect signals sent out by
satellites at the far reaches of the Solar system.
The nature of the sources which can emit
gravitational radiation follows from a detailed analysis
of the gravitational wave equation. It can be shown,
for example, that an expanding or contracting uniform
sphere of matter, accelerating during the expansion and
contraction, will not emit gravitational waves, because
of the spherical symmetry. Similarly, a cylinder or a
disc of matter or a dumbbell which is rotating around
its axis of symmetry cannot emit gravitational radiation
because of the cylindrical symmetry. But the same
dumbbell rotating around an axis perpendicular to
its length can emit gravitational radiation as shown
in Figure 1. The emitted gravitational radiation has
different components, the most important of which is
called quadrupole radiation. The other components
present are much weaker than the quadrupole part.
Figure 1: A dumbbell which is symmetric around the
axis A and is asymmetric around axis B. It will not emit
gravitational waves if it rotates around axis A but will
emit these waves while rotating around axis B.
Image Credit: Kaushal Sharma.
We have to depend on cosmic sources for
potentially detectable gravitational wave emission.
These sources are mainly of four types: (1) burst sources, like supernova explosions and gamma-ray bursts, which are short-duration, one-off events; (2) continuous sources, like spinning compact objects with some asymmetry, which can emit gravitational waves at the same frequency for a long duration; (3) binary sources, with each of the two components being a compact object such as a white dwarf, a neutron star or a black hole; (4) stochastic background sources, which are collections of a large number of weak sources that cannot be detected individually but can be observed collectively as a background of gravitational waves.
Here we will consider only stellar binary sources, which
are bound systems of two stellar mass objects which are
in orbit around each other. The objects can be normal
or evolved stars or compact remnants of stars which
have finished their evolution, which are white dwarfs,
neutron stars or black holes. The binary can be made
up of any combination of these objects.
It can be shown from a detailed analysis of
the gravitational wave equation that a binary system
consisting of two objects with the same mass M going
round each other in a circular orbit with period P would
emit gravitational radiation with luminosity (energy
emitted per unit time)
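(a minimal sketch, assuming the standard quadrupole result for two equal point masses M in a circular orbit of period P)

L_{GW} = \frac{2^{13/3}}{5}\,\frac{G^{7/3}}{c^{5}}\,M^{10/3}\left(\frac{2\pi}{P}\right)^{10/3}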
Inserting numerical constants, and expressing the
mass M in units of Solar mass and the orbital period
in hours, the gravitational wave luminosity can be
expressed as
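(again a sketch based on the quadrupole result above, with the numerical coefficient rounded)

L_{GW} \approx 2\times 10^{26}\left(\frac{M}{M_\odot}\right)^{10/3}\left(\frac{P}{1\,\mathrm{h}}\right)^{-10/3}\ \mathrm{W}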
A compact binary with Solar mass stars and a
period of 1h would therefore have a gravitational wave
luminosity comparable to the energy emitted by the Sun
over infrared to ultraviolet wavelengths. The luminosity
would be higher for more massive objects and smaller
separations, but would be lower for larger separations.
The emission of gravitational waves causes the
binary to lose energy, so that the total energy becomes
more negative, and the separation of the components
decreases, as discussed in BHS-7. This leads to a decrease
in the orbital period at a rate given by
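(a sketch obtained by combining the luminosity above with the Newtonian orbital energy, again for equal masses in a circular orbit)

\frac{dP}{dt} = -\,\frac{192\,\pi}{5\cdot 2^{1/3}}\,\frac{1}{c^{5}}\left(\frac{2\pi G M}{P}\right)^{5/3}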
Introducing constants, the expression reduces to
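(a sketch under the same equal-mass, circular-orbit assumptions, with M in solar masses and P in hours)

\frac{dP}{dt} \approx -\,3.5\times 10^{-12}\left(\frac{M}{M_\odot}\right)^{5/3}\left(\frac{P}{1\,\mathrm{h}}\right)^{-5/3}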
Since the period P has the units of time, dP/dt is a dimensionless quantity. (The above expressions are all taken from the book Gravity: An Introduction to Einstein's General Relativity by James B. Hartle.)
The expressions can be generalised to unequal
mass components and orbits which have eccentricity,
i.e. they have an elliptical shape. The radiation is
emitted at a fundamental frequency which is twice
the orbital frequency, and at higher harmonics of the
fundamental, with the number of harmonics depending
on the eccentricity.
The energy loss due to the emission of gravitational
waves causes the system to shrink in size, which is
indicated by a decrease in its orbital period P, which
leads to an increase in the frequency with which the two
bodies go round each other. The period changes more
rapidly as P decreases, and eventually the two objects
should collide and merge together. That is possible
when the bodies are compact objects like neutron stars or
black holes. In fact, all the gravitational wave emitting
binary systems discovered by the LIGO gravitational
wave detectors are black hole binaries, neutron star binaries, or binary systems with one component a
neutron star and the other a black hole. All these
binaries were detected close to the merger, when the
gravitational wave luminosity is large enough to be
detected and the change in frequency and the merger
occur over just a fraction of a second. We will describe
these detections in greater detail in future stories.
The situation is more complicated for binary
systems which consist of normal or evolved stars. As
such a binary continues to shrink in size, each star exerts a greater gravitational force on the other, which changes their shapes away from a nearly spherical form. If the binary is very compact, the star with the
greater mass could fill an equipotential surface known
as a Roche lobe, which leads to mass transfer from the
Roche filling star to the other star, which changes the
nature of both stars. Matter can also leave the system.
For these effects to occur, the binary will need to be very
compact, so that there is copious gravitational wave
emission leading to contraction. But most stellar binary
systems have a separation of 1 AU or more, where AU
is the Astronomical Unit, which is equal to the Sun-
Earth distance of about 150 million km. Such a binary
for Solar mass stars would have a period of a fraction
of a year, i.e. several thousand hours. Such binaries
would have very feeble gravitational wave emission,
leading to very slow contraction. For rapid contraction
eventually leading to coalescence, we need binaries of
short periods with the binary components made up of
compact objects. We will come across such binaries in
our future stories.
Next Story: In the next story we will consider the discovery and formation of the binary radio pulsar PSR 1913+16, whose discovery in 1974 led to the first evidence that gravitational waves exist.
About the Author
Professor Ajit Kembhavi is an emeritus
Professor at Inter University Centre for Astronomy
and Astrophysics and is also the Principal Investigator
of the Pune Knowledge Cluster. He was the former
director of Inter University Centre for Astronomy and
Astrophysics (IUCAA), Pune, and a former vice president of the International Astronomical Union. In collaboration with IUCAA, he pioneered astronomy outreach activities from the late 1980s to promote astronomy
research in Indian universities.
X-ray Astronomy: Through Missions
by Aromal P
airis4D, Vol.3, No.3, 2025
www.airis4d.com
2.1 Satellites in 1990s
The huge success of RXTE and BeppoSAX gave a supersonic boost to X-ray astronomy. Both satellites influenced the upcoming missions and proved that many things lay hidden in these energy ranges. Even though RXTE was a huge success and had great timing capabilities, it did not have any optics, which made localization of sources difficult, and its spectra were not detailed enough to reveal many small-scale spectral variations. In the latter half of the 1990s, we saw many exciting missions that filled the gaps left by RXTE, and today, we are going to discuss those game-changers.
2.1.1 ARGOS (USA)
The Advanced Research and Global Observations
Satellite (ARGOS), also known as the NRL-801
Experiment, was funded by the U.S. Department of
Defense Space Test Program and operated by the U.S.
Air Force. This satellite was planned as a technology
demonstrator for new detectors. It was launched in February 1999 into an 800 km circular sun-synchronous orbit with a 98.7° inclination and was decommissioned in
2003. ARGOS had nine experiments among which
one was for high-energy astrophysical studies. We will
focus solely on that experiment here.
Unconventional Stellar Aspect (USA) was a low-cost X-ray timing experiment with an unprecedented timing capability of 1 µs. USA was operated for nearly 16 months starting from April 1999. It consisted of large-area gas scintillation proportional counters that worked in a 1-15 keV energy range and had an effective area of 2000 cm².
USA tracked bright X-ray sources without commands from the ground stations. It provided timing and spectral information about different Low Mass X-ray Binaries.
2.1.2 CHANDRA
Chandra was another flagship project that
revolutionized our ideas about high-energy phenomena.
The satellite was launched in July 1999 aboard the space shuttle Columbia into a highly elliptical orbit with a perigee of 10,000 km and an apogee of 140,000 km and an inclination angle of 28.5°. It revolved around the Earth in 64 hours,
which gave longer observation time to a source without
interruption. Chandra X-ray Observatory was named
after Indian-American scientist Dr. Subrahmanyan
Chandrasekhar. Although initially planned for 5
years, the satellite is still in operation, giving valuable
data and redefining our understanding of the X-ray
sky. Chandra surpassed all its predecessors in angular resolution by orders of magnitude, which allowed the satellite to detect and study nearby X-ray emitting sources and also helped with the precise positioning of X-ray sources. Chandra had a High-Resolution
Mirror Assembly (HRMA), a multi-mirror array for
focusing highly energetic X-rays through the grazing
angle principle. HRMA had a diameter of 1.2 m, a length of 80 cm, and a focal length of 10 m, with two scientific instruments located at its focal plane. Chandra also carried high-resolution spectroscopic instruments, which could reveal small deviations in the spectrum.
Chandra had 4 scientific instruments:
High Energy Transmission Grating (HETG)
Figure 1: Chandra X-ray Observatory.
Credits: NASA.
consisted of 336 grating facets for high-resolution
spectroscopy, with a spectral resolving power of
up to 1000. It worked with the ACIS detectors in an energy range of 0.4-10 keV.
Low Energy Transmission Grating (LETG)
consisted of 540 grating elements, giving it the
highest spectral resolving power among all the
instruments on Chandra. It had a resolving power
of more than 1000 in the lower energy ranges of
0.07-0.2 keV. The instrument covered an energy range of 0.07-7.29 keV.
High Resolution Camera (HRC) consisted of two microchannel plate (MCP) detectors, one for imaging and one for spectroscopy. Each MCP was made of a 10-cm square cluster of 69 million tiny lead-oxide glass tubes about 10 micrometers in diameter, which is responsible for the high spatial resolving power of the instrument. HRC was located in the focal plane of the HRMA. It worked in the energy range 0.08-10 keV, with a maximum effective area of 133 cm² (at 0.277 keV) for the imaging detector. The imaging detector had a Field of View (FOV) of 16.9 × 16.9 arcmin.
Advanced CCD Imaging Spectrometer (ACIS) consisted of 10 charge-coupled devices arranged in two arrays, one optimized for imaging wide fields (FOV: 30 × 30 arcmin) and the other optimized for readout of the HETG (FOV: 6 × 90 arcmin). This was the second instrument located at the focal plane, and it worked in the 0.2-10 keV energy range with an effective area of 227 cm² at 1 keV for the imaging part.
Chandra revealed many secrets of the universe,
including strong evidence for dark matter by studying
colliding galaxy clusters, such as the Bullet Cluster.
It made the first X-ray detection of Sagittarius A*,
the supermassive black hole at the center of our
galaxy. Chandra also contributed to the discovery
of intermediate-mass black hole candidates and studied
possible supermassive black hole binaries. Additionally,
it detected X-ray pulsations from Jupiter's poles. Chandra's Deep Field observations resolved 95% of the cosmic X-ray background into discrete sources, primarily supermassive black holes in galactic centers, shedding light on their rapid growth in the early universe. Chandra also provided evidence for a possible extragalactic exoplanet in the Whirlpool Galaxy (M51) using the transit method and studied space weather conditions relevant to planets in the Alpha Centauri system. After
25 years of service, Chandra continues to provide
valuable scientific data, expanding our knowledge of
the universe.
2.1.3 XMM-NEWTON
X-ray Multi-mirror Mission (XMM) - Newton
was launched by the European Space Agency (ESA) in
December 1999. It was launched into a highly elliptical orbit with a perigee of about 7,000 km and an apogee of about 114,000 km, with an inclination of 40°; over time its orbit has evolved. The highly elliptical orbit gives an orbital period of 48 hours, which provides unobscured observation of X-ray sources for a long time. Initially, XMM-
Newton was planned for 10 years, and currently, it is
entering its 25th year in observing outer space, with
all the instruments still working properly. The satellite
functions in the 0.1-15 keV X-ray energy range and, with its optical monitor, covers wavelengths from visible light to medium-energy X-rays. The XMM-Newton
observatory features three advanced X-ray telescopes,
each consisting of a sophisticated mirror assembly. Each
mirror module comprises 58 nested Wolter I grazing-
incidence mirrors. These mirror modules focus the incoming X-rays, and the scientific instruments in XMM-Newton are located at the focal planes of the three mirror modules.
XMM-Newton has three scientific instruments on
board:
Figure 2: XMM-Newton Observatory Credits: ESA
The European Photon Imaging Cameras (EPIC) - the primary instrument of XMM-Newton - consist of two Metal Oxide Semiconductor (MOS) CCD cameras and one pn-CCD camera, each placed at the focus of one of the mirror modules. EPIC provides imaging and spectroscopic data in an energy range of 0.2-12 keV. EPIC-MOS is designed for studies in the low-energy X-ray range, providing better imaging and spectroscopic capabilities than the pn-CCD, but the latter has higher timing capabilities than EPIC-MOS. The FOV of both the MOS and pn cameras in the EPIC system is 30 arcminutes. The combined effective area of the EPIC detectors is nearly 1,500 cm² at 1 keV.
The Reflection Grating Spectrometers (RGS) consist of a combination of reflection gratings and CCD detectors to achieve high-resolution soft X-ray spectroscopy in the energy range of 0.3-2.1 keV. RGS provides a spectral resolving power of 100-500 across this energy range, which helps to detect the different line transitions that dominate the soft X-ray band. RGS has two working modes: Spectral mode and Timing mode. Two identical RGS units are located behind the second and third mirror assemblies.
The Optical/UV Monitor (OM) is a 30-cm Ritchey-Chrétien telescope that provides simultaneous UV and optical data. It is co-aligned with the X-ray telescopes and works alongside them. It has three optical and three UV filters and two grisms (a grating combined with a prism) for low-energy spectroscopic studies. It works in the 180-600 nm wavelength range with a square FOV of 17 arcminutes.
With the three scientific instruments, XMM-Newton
provided a broadband analysis ranging from the visible
range to the X-rays. XMM-Newton achieved the first
measurement of a supermassive black hole’s spin rate
in galaxy NGC 1365, revealing rapid rotation that
informs galaxy evolution models. It also detected quasi-
periodic oscillations (QPOs) from the supermassive
black hole 1ES 1927+654. XMM-Newton studied tidal disruption events, in which a star is torn apart and its matter accreted by a black hole. Its deep surveys
aided in mapping dark matter distribution through
galaxy cluster studies. It also discovered unusually cold
neutron stars, which challenged the existing models of
stellar evolution. Over its 25-year-long and still continuing programme of studies and surveys, XMM-Newton has provided information that is a cornerstone of high-energy astrophysics.
The satellites of the 1990s inspired the coming generation to aim for bigger satellites and more ambitious science goals. X-ray astronomy got a huge boost, and in the years that followed, the quest for the unknown accelerated.
We will discuss the satellites that came after this in the
upcoming articles.
References
Santangelo, Andrea; Madonia, Rosalia; and Piraino, Santina. A Chronological History of X-Ray Astronomy Missions. Handbook of X-ray and Gamma-ray Astrophysics. ISBN 9789811645440
K. S. Wood, G. Fritz, P. Hertz, W. N. Johnson, M. P. Kowalski, M. N. Lovellette, M. T. Wolff, D. J. Yentis, E. Bloom, L. Cominsky, K. Fairfield, G. Godfrey, J. Hanson, A. Lee, P. Michelson, R. Taylor and H. Wen. The USA Experiment on the ARGOS Satellite: A Low Cost Instrument for Timing X-ray Binaries. SPIE Vol. 2280
X-ray and Gamma-ray Missions
Chandra Science Instruments
Chandra Instruments and Calibration
Chandra Spacecraft and Instruments
Chandra X-ray Observatory
XMM-Newton Instruments
Megan Masterson, Erin Kara, Christos Panagiotou, William N. Alston, Joheen Chakraborty, Kevin Burdge, Claudio Ricci, Sibasish Laha, Iair Arcavi, Riccardo Arcodia, S. Bradley Cenko, Andrew C. Fabian, Javier A. García, Margherita Giustini, Adam Ingram, Peter Kosec, Michael Loewenstein, Eileen T. Meyer, Giovanni Miniutti, Ciro Pinto, Ronald A. Remillard, Dev R. Sadaula, Onic I. Shuvo, Benny Trakhtenbrot, Jingyi Wang. Millihertz Oscillations Near the Innermost Orbit of a Supermassive Black Hole. arXiv:2501.01581
X-ray Satellite XMM-Newton Celebrates 20
Years in Space
XMM-Newton factsheet
M. Güdel. A decade of X-ray astronomy with XMM-Newton. A&A 500, 595-596 (2009). DOI: 10.1051/0004-6361/200912208
About the Author
Aromal P is a research scholar in the Department of Astronomy, Astrophysics and Space Engineering (DAASE) at the Indian Institute of Technology Indore. His research mainly focuses on studies of thermonuclear X-ray bursts on neutron star surfaces and their interaction with the accretion disk and corona.
Supernova: The Explosive Death of Stars
by Sindhu G
airis4D, Vol.3, No.3, 2025
www.airis4d.com
Figure 1: An artist's interpretation of a generic supernova. (Image Credit: Soubrette/iStock/Getty Images Plus)
3.1 Introduction
Supernovae are among the most powerful and
fascinating events in the universe. These stellar
explosions release immense amounts of energy,
outshining entire galaxies for a short period. They play
a crucial role in the cosmic cycle of matter, dispersing
heavy elements into space, influencing star formation,
and even impacting planetary systems. This article
explores the different types of supernovae, their causes,
their role in cosmic evolution, and their significance in
astrophysics.
3.2 What is a Supernova?
A supernova occurs when a star undergoes a
catastrophic explosion, resulting in an intense burst
of energy. This explosion marks the end of a star's life
cycle and can be observed across vast cosmic distances.
Supernovae are classified into two main types: Type I
and Type II, based on their underlying mechanisms and
the presence or absence of hydrogen in their spectra.
3.3 Types of Supernovae
3.3.1 Type I Supernovae
Type I supernovae lack hydrogen lines in their
spectra and are further divided into subcategories:
Type Ia: These occur in binary star systems
where a white dwarf accretes matter from its
companion. When the white dwarf reaches the Chandrasekhar limit (about 1.4 solar masses), it
undergoes runaway nuclear fusion, leading to a
thermonuclear explosion.
Type Ib and Ic: These result from massive stars
that have shed their outer hydrogen layers before
collapsing. Type Ib still retains helium, whereas
Type Ic lacks both hydrogen and helium.
3.3.2 Type II Supernovae
Type II supernovae retain hydrogen in their spectra
and originate from massive stars undergoing core
collapse. These stars exhaust their nuclear fuel, leading
to gravitational collapse and a subsequent explosion.
They are further classified into:
Type II-P: Characterized by a plateau in their
light curves, indicating a prolonged phase of
hydrogen recombination.
Type II-L: Display a linear decline in brightness
over time.
Type IIn: Show narrow spectral lines due to
strong interactions with surrounding material.
3.4 Causes of Supernovae
The mechanisms driving supernovae vary based
on the type:
1. Thermonuclear Explosions: In Type Ia supernovae, the explosion results from the detonation of a white dwarf in a binary system after exceeding its mass limit.
2. Core Collapse: In Type II, Ib, and Ic supernovae, a massive star undergoes gravitational collapse when its core can no longer support the weight of the outer layers. This collapse triggers a shock wave that expels the star's outer material into space.
3.5 The Aftermath of a Supernova
The remnants of a supernova include:
Neutron Stars and Pulsars: If the core of a
collapsed star remains between 1.4 and 3 solar
masses, it forms a neutron star, an incredibly
dense object composed primarily of neutrons.
Black Holes: If the core's mass exceeds the
Tolman–Oppenheimer–Volkoff limit, it collapses
into a black hole.
Supernova Remnants: Expelled material
enriches the interstellar medium with elements
like iron, oxygen, and silicon, which contribute
to the formation of new stars and planets.
3.6 The Role of Supernovae in Cosmic Evolution
Supernovae are not just destructive events; they are
crucial to the life cycle of the universe. These explosions
distribute heavy elements essential for the formation
of planets and biological life. Without supernovae,
elements like carbon, oxygen, and iron would not exist
in abundance, making planetary formation and life
as we know it impossible. Supernovae also regulate
star formation by injecting energy into the interstellar
medium, creating shock waves that compress gas clouds,
leading to the birth of new stars.
3.7 Importance of Supernovae in
Astronomy
Supernovae play a critical role in shaping the
cosmos:
Element Formation: Heavy elements such
as gold, silver, and uranium are produced
in supernova nucleosynthesis and distributed
throughout the universe.
Cosmic Distance Indicators: Type Ia
supernovae serve as standard candles for
measuring astronomical distances, aiding in our
understanding of the universe's expansion.
Triggering Star Formation: Shock waves
from supernovae can compress interstellar clouds,
leading to the birth of new stars.
Influencing Planetary Systems: The chemical
enrichment from supernovae affects planetary
composition and the potential for life.
3.8 Observing Supernovae
Astronomers detect supernovae using telescopes
equipped with optical, X-ray, and radio instruments.
Observatories like the Hubble Space Telescope, the
Chandra X-ray Observatory, and ground-based facilities
such as the Keck Observatory monitor these explosive
events. Citizen science projects and automated surveys,
like the All-Sky Automated Survey for Supernovae
(ASAS-SN), contribute significantly to supernova
research.
3.9 Notable Supernovae in History
Several supernovae have been observed throughout
history, contributing to our understanding of these
cosmic events:
SN 1054: Observed by Chinese and Middle Eastern astronomers, its remnant is now known as the Crab Nebula.
SN 1572 (Tycho’s Supernova): Studied by Tycho
Brahe, this event provided evidence against the
Aristotelian belief in an unchanging celestial
sphere.
SN 1604 (Kepler’s Supernova): Documented
by Johannes Kepler, it was the last supernova
observed in the Milky Way.
SN 1987A: One of the closest and most studied
supernovae, providing insights into core-collapse
mechanisms.
3.10 Future Research and Supernova
Detection
With advancements in technology, astronomers
continue to discover and analyze supernovae. The Vera
C. Rubin Observatory is expected to detect thousands
of supernovae per year, improving our understanding
of stellar evolution, dark energy, and the structure
of the universe. Future space missions will further
enhance our ability to observe and interpret these stellar
explosions.
3.11 Conclusion
Supernovae are essential to our understanding of
stellar evolution, cosmic chemistry, and the universe’s
expansion. These explosions not only mark the end
of massive stars but also seed the universe with the
building blocks necessary for planets and life. Ongoing
studies continue to uncover new insights into the physics
of these stellar cataclysms, helping us better understand
the cosmos.
References:
Supernova
A Brief Review of Historical Supernovae
History of supernova observation
Supernova explosions and historical chronology
What is a supernova?
About the Author
Sindhu G is a research scholar in Physics doing research in Astronomy & Astrophysics. Her research mainly focuses on the classification of variable stars using different machine learning algorithms. She also works on period prediction for different types of variable stars, especially eclipsing binaries, and on the study of optical counterparts of X-ray binaries.
Part II
Biosciences
Synthetic Biology: A Revolutionary Scientific
Frontier - Part 2
Recent Advancements in Synthetic Biology
by Geetha Paul
airis4D, Vol.3, No.3, 2025
www.airis4d.com
1.1 Introduction
Synthetic biology is no longer a futuristic dream; it's a rapidly evolving reality. Recent advancements are transforming our ability to design and build biological systems with unprecedented precision. From
engineering microbes to produce life-saving drugs to
creating sustainable alternatives to fossil fuels, the
potential of synthetic biology seems limitless. In the
ever-evolving landscape of science and technology,
synthetic biology stands out as a revolutionary field
that merges biology, engineering, and computational
design to reshape the boundaries of what is possible.
By reprogramming the genetic code of organisms,
scientists can now create novel biological systems
or redesign existing ones to address some of the
most pressing challenges facing humanity. From
sustainable energy production and environmental
remediation to groundbreaking medical therapies and
bio-manufacturing, synthetic biology is paving the
way for a future where biology is as programmable
as software.
Recent advancements in this field have accelerated
its potential, driven by breakthroughs in gene editing
technologies like CRISPR, the development of artificial
intelligence-driven design tools, and the synthesis
of increasingly complex genetic circuits. These
innovations are not only expanding our understanding of life but are also enabling the creation of organisms with entirely new functions, opening doors
to applications that were once the realm of science
fiction.
This article explores the latest breakthroughs in
synthetic biology, highlighting how these advancements
are transforming industries, addressing global
challenges, and raising important ethical and societal
questions. As we stand on the brink of a new era
in biological engineering, the implications of these
discoveries are profound, promising to reshape our
world in ways we are only beginning to imagine.
1.2 A Brief Overview
1.2.1 CRISPR-Cas9 Enhancements
CRISPR-Cas9 is a groundbreaking tool in synthetic biology that allows scientists to edit genes with high precision and fewer unwanted side effects, and it has become more accurate and versatile. Base editing
and prime editing are new methods that allow scientists
to make tiny, specific changes to DNA without making
big cuts, resulting in more precise gene modifications.
CRISPR-Cas variants like Cas12 and Cas13 are also
being explored for RNA editing and diagnostics.
Figure 1: The figure illustrates how the Cas9 enzyme (blue) generates breaks in double-stranded DNA by
using its two catalytic centers (blades) to cleave each
strand of a DNA target site (gold) next to a PAM
sequence (red) and matching the 20-nucleotide sequence
(orange) of the single guide RNA (sgRNA). The sgRNA
includes a dual-RNA sequence derived from CRISPR
RNA (light green) and a separate transcript (tracrRNA,
dark green) that binds and stabilizes the Cas9 protein.
Cas9-sgRNA–mediated DNA cleavage produces a blunt
double-stranded break that triggers repair enzymes to
disrupt or replace DNA sequences at or near the cleavage
site. Catalytically inactive forms of Cas9 can also be
used for programmable regulation of transcription and
visualization of genomic loci.
Image Courtesy: www.ndsu.edu/pubweb/mcclean/ctig/ctigfall2016/doudna-and-charpentier-the-new-frontier-of-genome-engineering-with-CRISPR-Cas9.pdf
1.2.2 DNA Synthesis Innovations
Advances in DNA synthesis technologies,
such as enzymatic DNA synthesis and chip-based
oligonucleotide synthesis, have significantly reduced
costs and increased the speed of gene assembly.
Companies like Twist Bioscience and DNA Script are
pioneering these methods, enabling the synthesis of
longer and more complex DNA sequences.
Figure 2: The synthetic biology test cycle. (From
the top, clockwise) Synthetic DNA constructs are
designed and manipulated using computer-aided design
software. The designed DNA is then divided into
synthesisable pieces (synthons) up to 1–1.5 kbp. The
synthons are then broken into overlapping single-
stranded oligonucleotide sequences and chemically
synthesised. The oligonucleotides are then assembled
together into the designed synthons using gene synthesis
techniques. Multiple synthons can be assembled into
larger DNA assemblies or devices if necessary. The
assembled DNAs are then typically cloned into an
expression vector and sequence-verified. Once verified,
the synthetic constructs are transformed into a cell
and the function of the synthetic construct is assayed.
Depending on the results, the constructs can then be
modified or refined, and the test cycle is repeated until
a DNA construct is obtained that produces the desired
function.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC5204324/
1.2.3 Synthetic Cells and Organs
Researchers are making strides in creating
synthetic cells and organs by combining synthetic
biology with tissue engineering. For example,
artificial cells capable of performing essential metabolic
functions and synthetic organoids for drug testing are
being developed. These advancements hold promise
for regenerative medicine and personalised healthcare.
Figure 3: The figure illustrates the modular approach
for building synthetic cells with cell-like properties.
The integration of functional modules creates synthetic
cells with increasing complexity.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC9314110/
1.2.4 Synthetic Microbes
Engineered microbes are being designed for
applications ranging from biofuel production to disease
treatment. Recent work includes the development
of synthetic bacteria that can detect and destroy
pathogens or produce therapeutic compounds in the
gut microbiome.
Figure 4: The figure shows future food challenges
which give rise to opportunities for synthetic biology.
(A) Synthetic biology tries to implement engineering
principles into life. The lightbulb highlights some of
the challenges for future foods. These challenges may
be inspirational for experimental designs for synthetic
biology methodology with the potential to improve
a process or overcome related problems. Microbes
can be altered through the “Design-Build-Test-Learn”
cycle for a greater aim and particular microbes from
traditional fermentation processes have the potential to
address future food challenges. (B) Workflow showing
a pipeline to domesticate microbes, for example from
traditional fermentations processes.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC9523148/
The initial source can be analyzed by traditional
isolation of individual microbes or by metagenomics
approaches to initially get an overview of the community
before individuals are isolated. The isolated microbes
need to be identified and characterized. Once the
organism is known, one can start to make the organism
accessible for synthetic biology approaches. Therefore,
initial genetic engineering methods need to be
established (i.e., transformation procedures), followed
by advances in engineering tools (i.e., CRISPR/Cas-
based methods) and the generation of modular tool-
boxes for quick and reliable engineering of the organism.
The established tools allow microbe domestication, for
example by removal or addition of genes, for easier
handling. Subsequently the domesticated microbe can
be used for intensive engineering towards a desired goal, for example the assimilation of a sustainable feedstock.
1.2.5 Machine Learning and Data Science
Integration
Machine learning is revolutionising synthetic biology by enabling the prediction of gene expression, protein folding, and metabolic pathways. Tools like AlphaFold
have accelerated protein design, while AI-driven
platforms optimise genetic circuits and metabolic
engineering.
Figure 5: Shows the Machine learning and data science
integration. a, The performance of AlphaFold on the
CASP14 dataset (n = 87 protein domains) relative to
the top-15 entries (out of 146 entries), group numbers
correspond to the numbers assigned to entrants by CASP.
Data are median and the 95% confidence interval of
the median, estimated from 10,000 bootstrap samples.
b, Prediction of CASP14 target T1049 (PDB 6Y4F,
blue) compared with the true (experimental) structure
(green). Four residues in the C terminus of the crystal
structure are B-factor outliers and are not depicted.
c, CASP14 target T1056 (PDB 6YJ1). An example
of a well-predicted zinc-binding site (AlphaFold has
accurate side chains even though it does not explicitly
predict the zinc ion). d, CASP target T1044 (PDB
6VR4)—a 2,180-residue single chain—was predicted
with correct domain packing (the prediction was made
after CASP using AlphaFold without intervention). e,
Model architecture. Arrows show the information flow
among the various components described in this paper.
Array shapes are shown in parentheses with s, number
of sequences (Nseq in the main text); r, number of
residues (Nres in the main text); c, number of channels.
Image Courtesy:
https://www.nature.com/articles/s41586-021-03819-2
The combination of datasets allowed AlphaFold
to learn the complex relationships between amino
acid sequences and protein structures, leading to
its groundbreaking performance in protein structure
prediction.
1.2.6 Development of Synthetic Vaccines
Synthetic vaccines are vaccines designed and
constructed using synthetic biology and chemical
synthesis techniques. Unlike traditional vaccines that use weakened or inactivated pathogens, synthetic vaccines are created from synthesised components, such as molecules that mimic parts of the pathogen, nucleic acids encoding specific antigens, assemblies of synthetic proteins, or synthetic oligosaccharides combined with carrier proteins.
Synthetic biology has been pivotal in vaccine
development, particularly during the COVID-19
pandemic. mRNA vaccines, such as those developed
by Moderna and Pfizer-BioNTech, are a prime example
of synthetic biology. Researchers are now exploring
synthetic vaccines for other diseases, including cancer
and HIV.
Figure 6: A brief illustration of current chemical and synthetic biology approaches for developing cancer vaccines (DC cell: dendritic cell).
Image Courtesy: https://pmc.ncbi.nlm.nih.gov/articles/PMC9611187/
1.2.7 Sustainable Materials Production
Synthetic biology enables the production of
sustainable materials, such as biodegradable plastics,
bio-based textiles, and lab-grown leather. Engineered
microorganisms convert renewable feedstocks into eco-
friendly materials, reducing reliance on fossil fuels.
1.2.8 Bioremediation Techniques
Engineered microbes and plants are being
developed to clean up environmental pollutants, such
as oil spills, heavy metals, and plastic waste. Recent
advancements include the creation of synthetic bacteria
that can degrade polyethylene terephthalate (PET)
plastics and detoxify contaminated soil.
Figure 7: A proof of concept for engineering
environmental microbiomes to rapidly degrade PET
plastics.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC11420662/
Figure 8: Conjugation of pFAST-PETase-cis into
wastewater bacteria. (A) Schematic map of pFAST-
PETase-cis (not to scale): oriT, RK2/RP4 conjugative
origin of transfer; FAST-PETase, gene for FAST-PETase
enzyme; AmpR, ampicillin resistance gene (bla-TEM
1 ); mCherry, gene for fluorescent protein mCherry;
conjugation genes, encoding the IncP RK2/RP4
conjugation system; GmR, gentamycin resistance gene;
pBBR1 oriV, plasmid origin of replication. (B) FAST-
PETase coding region (not to scale) highlighting the
arabinose-inducible promoter (PBAD), signal peptide
(SPstu) and 6× His-tag. (C) Experimental procedure for
conjugation of pFAST-PETase-cis into bacteria from
a wastewater sample. (D) Experimental procedure
for measuring conjugation efficiency in wastewater
suspension.
Image Courtesy:
https://pmc.ncbi.nlm.nih.gov/articles/PMC11420662/
1.3 Conclusion
These advancements highlight the transformative
potential of synthetic biology across diverse fields,
offering innovative solutions to global challenges
while raising important ethical and regulatory
considerations. The integration of machine learning
and data science is further accelerating progress,
enabling the design of complex biological systems
with unprecedented accuracy and efficiency.
However, as synthetic biology continues to push
the boundaries of what is possible, it also raises
important ethical, social, and regulatory questions. The
ability to engineer life at the molecular level demands
careful consideration of biosafety, biosecurity, and the
equitable distribution of benefits. Collaborative efforts
among scientists, policymakers, and the public will be
essential to ensure that these powerful technologies are
used responsibly and for the greater good.
As we look to the future, synthetic biology holds
the promise of transforming our world in ways that were
once unimaginable. By harnessing the power of biology,
we are not only gaining a deeper understanding of life but
also unlocking innovative solutions to global challenges,
paving the way for a healthier, more sustainable, and
resilient future. The journey ahead is as exciting as
it is complex, and the potential for positive impact is
immense.
References
Doudna, J. A., & Charpentier, E. (2014). Genome
editing. The new frontier of genome engineering with
CRISPR-Cas9. Science (New York, N.Y.), 346(6213),
1258096. https://doi.org/10.1126/science.1258096
Hughes, R. A., & Ellington, A. D. (2017).
Synthetic DNA Synthesis and Assembly: Putting
the Synthetic in Synthetic Biology. Cold Spring
Harbor perspectives in biology, 9(1), a023812.
https://doi.org/10.1101/cshperspect.a023812
Guindani, C., da Silva, L. C., Cao, S.,
Ivanov, T., & Landfester, K. (2022). Synthetic
Cells: From Simple Bio-Inspired Modules to
Sophisticated Integrated Systems. Angewandte Chemie
(International ed. in English), 61(16), e202110855.
https://doi.org/10.1002/anie.202110855
Hwang, I. Y., Koh, E., Wong, A., March, J. C.,
Bentley, W. E., Lee, Y. S., & Chang, M. W. (2017).
Engineered probiotic Escherichia coli can eliminate
and prevent Pseudomonas aeruginosa gut infection in
animal models. Nature communications, 8, 15028.
https://doi.org/10.1038/ncomms15028
Jumper, J., Evans, R., Pritzel, A. et
al. Highly accurate protein structure prediction
with AlphaFold. Nature 596, 583–589 (2021).
https://doi.org/10.1038/s41586-021-03819-2
Pardi, N., Hogan, M., Porter, F. et al.
mRNA vaccines a new era in vaccinology.
Nat Rev Drug Discov 17, 261–279 (2018).
https://doi.org/10.1038/nrd.2017.243
Keasling J. D. (2010). Manufacturing
molecules through metabolic engineering. Science
(New York, N.Y.), 330(6009), 1355–1358.
https://doi.org/10.1126/science.1193990
Yoshida, S., Hiraga, K., Takehana, T., Taniguchi,
I., Yamaji, H., Maeda, Y., Toyohara, K., Miyamoto,
K., Kimura, Y., & Oda, K. (2016). A bacterium that
degrades and assimilates poly(ethylene terephthalate).
Science (New York, N.Y.), 351(6278), 1196–1199.
https://doi.org/10.1126/science.aad6359
https://pmc.ncbi.nlm.nih.gov/articles/PMC5204324/
https://pmc.ncbi.nlm.nih.gov/articles/PMC9314110/
https://www.nature.com/articles/s41586-021-03819-2
https://pmc.ncbi.nlm.nih.gov/articles/PMC9611187/
https://pmc.ncbi.nlm.nih.gov/articles/PMC11420662/
About the Author
Geetha Paul is one of the directors of
airis4D. She leads the Biosciences Division. Her
research interests extend from Cell & Molecular
Biology to Environmental Sciences, Odonatology, and
Aquatic Biology.
Part III
Computer Programming
Principal Component Analysis
by Linn Abraham
airis4D, Vol.3, No.3, 2025
www.airis4d.com
1.1 Introduction
Principal Component Analysis (PCA) is a useful tool in machine learning for feature extraction, dimensionality reduction and data visualization. The key idea is to identify a smaller number of features or dimensions that still describe your actual data. Apart from reducing the size of the data that the machine learning technique needs to deal with, this can also improve your results by removing possible noise in the data. Remember that PCA is a linear technique, since the PCA axes are simply a rotated version of your original axes. The idea of the principal component is to find the direction in your original feature space along which the variation in the data is maximum. Once this is done, the second principal component is the direction orthogonal to the first that accounts for the maximum variation in the remaining data. This is repeated until all of the data variation is accounted for. Let us try to understand this technique by going through the math.
1.2 Covariance
Variance of a set of numbers is a measure of how spread out the numbers are. It is computed as the expected value of the squared distance between each value and the mean (the expected value):
$\mathrm{var}(\{x_i\}) = E\big[(\{x_i\} - \mu)^2\big]$
If we have a set of random variables, then for each variable we end up with a set of numbers corresponding to how many times the variable was measured. Computing the variance of each random variable is then straightforward using this definition. When it comes to covariance, however, we extend this idea to two random variables simultaneously. Here we define the covariance of two random variables using a similar distance measure, but instead of a single squared term we have a cross term containing the two variables. The covariance of two variables, so defined, measures how dependent the two variables are (in the statistical sense).
When we have a data matrix of order n × m, corresponding to n samples each with m features, taking all pairwise combinations of variables yields a square symmetric matrix of order m × m, which is called the covariance matrix. The variances then appear as the diagonal terms of this covariance matrix.
$\mathrm{cov}(\{x_i\}, \{y_i\}) = E\big[(\{x_i\} - \mu)(\{y_i\} - \nu)\big]$

where µ is the mean of the set {x_i}, ν is the mean of the set {y_i}, and E is the expectation operator.
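To make the definition concrete, here is a small NumPy check (the data and variable names are illustrative, not from the article): the covariance matrix of an n × m data matrix is m × m, and its diagonal holds the individual variances. Note that np.cov uses the unbiased 1/(n−1) normalisation.

```python
import numpy as np

# Illustrative data matrix: n = 100 samples, m = 3 features (synthetic values).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))

# np.cov with rowvar=False treats columns as variables, giving an m x m matrix.
C = np.cov(X, rowvar=False)
print(C.shape)                                          # (3, 3)

# The diagonal of the covariance matrix holds the per-feature variances.
print(np.allclose(np.diag(C), X.var(axis=0, ddof=1)))   # True
```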
1.3 Eigenvectors
Let us refresh our memory of what eigenvectors and eigenvalues are. For a square symmetric matrix A, if there exists a non-zero vector V which, when operated on by the matrix, results in a scaled version of the same vector, i.e. A V = λV, then such a vector is called an eigenvector and the scale factor λ is known as the corresponding eigenvalue.
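As a quick illustration of this definition (the matrix below is an arbitrary example, not one from the article), NumPy's symmetric eigen-solver returns vectors that the matrix merely rescales:

```python
import numpy as np

# A small square symmetric matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is NumPy's eigen-solver for symmetric matrices.
eigvals, eigvecs = np.linalg.eigh(A)
v = eigvecs[:, 0]

# Operating on an eigenvector only rescales it: A v = lambda v.
print(np.allclose(A @ v, eigvals[0] * v))   # True
```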
1.4 Principal Components
The objective of doing PCA is to find a new set
of basis vectors for representing the data such that in
this new space, the covariance matrix corresponding to
the transformed data matrix becomes a diagonal matrix.
From the spectral theorem we have the following result for a square symmetric matrix A:

$E^T A\, E = D \quad (1.1)$

where D is a diagonal matrix made out of the eigenvalues and E is an orthogonal matrix whose columns contain the normalized eigenvectors of A.
Now, for any transformation matrix P which acts on the original data matrix X, we have the following result:

$\mathrm{cov}(P^T X) = P^T\, \mathrm{cov}(X)\, P \quad (1.2)$

Since we require P to be such that cov(P^T X) is a diagonal matrix, it is clear by comparing the two previous equations that P should be the matrix made out of the eigenvectors of the covariance matrix of X. Since the left-hand side of equation 1.1 and the right-hand side of equation 1.2 both represent a change of basis operation, the interpretation of P^T X is that we are going from the original space to another space defined by the eigenvectors of the covariance matrix (in the original space).
Since the eigenvalues corresponding to the eigenvectors are the diagonal elements of the new covariance matrix, they also represent the data variances along the eigenvector directions. The eigenvector corresponding to the largest eigenvalue becomes your first principal component, and so on until you get to the eigenvector with the smallest eigenvalue, which becomes your last principal component. If a combination of k eigenvectors is enough to explain the bulk of the variance in your data set, you can choose to drop the rest. An approximation of the original data can then be reconstructed by projecting the retained k components back onto the original axes.
1.5 The PCA algorithm
It is now time to summarize the PCA algorithm, which is quite simple to implement using NumPy; a minimal code sketch follows the steps below.
• Write the N data points x_i = (x_{1i}, x_{2i}, ..., x_{Mi}) as row vectors.
• Put these vectors into a matrix X (which will have size N × M).
• Centre the data by subtracting off the mean of each column, putting the result into a matrix B.
• Compute the covariance matrix C = (1/N) B^T B.
• Compute the eigenvalues and eigenvectors of C, so that V^{-1} C V = D, where V holds the eigenvectors of C and D is the M × M diagonal matrix of eigenvalues.
• Sort the columns of D into order of decreasing eigenvalues, and apply the same ordering to the columns of V.
• Reject those directions with eigenvalue less than some threshold η, leaving K dimensions in the data.
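Following these steps, here is a minimal NumPy sketch (the function name pca and the toy data are illustrative; it keeps a fixed number of components rather than thresholding eigenvalues by η, and np.cov normalises by N−1 instead of N, which only rescales the eigenvalues):

```python
import numpy as np

def pca(X, n_components):
    # Centre the data by subtracting the mean of each column.
    B = X - X.mean(axis=0)
    # Covariance matrix of the centred data.
    C = np.cov(B, rowvar=False)
    # Eigen-decomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(C)
    # Sort eigenvalues and eigenvectors in decreasing order.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep the leading n_components directions and project the data onto them.
    V = eigvecs[:, :n_components]
    return B @ V, eigvals

# Toy example: 200 samples with 5 features, two of which are correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * rng.normal(size=200)

scores, variances = pca(X, n_components=2)
print(scores.shape)    # (200, 2)
print(variances)       # variances along the principal directions, largest first
```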
References
[Chatfield and Collins(1980)] Christopher Chatfield and Alexander J. Collins. Introduction to Multivariate Analysis. Springer US, Boston, MA, 1980. ISBN 978-0-412-16030-1, 978-1-4899-3184-9. doi: 10.1007/978-1-4899-3184-9.
[Marsland(2014)] Stephen Marsland. Machine Learning: An Algorithmic Perspective. Chapman and Hall/CRC, 2nd edition, October 2014. ISBN 978-0-429-10250-9. doi: 10.1201/b17476.
[Halmos(1974)] P. R. Halmos. Finite-Dimensional Vector Spaces. Undergraduate Texts in Mathematics. Springer-Verlag, 1974. ISBN 0-387-90093-4, 978-0-387-90093-3.
About the Author
Linn Abraham is a researcher in Physics,
specializing in A.I. applications to astronomy. He is
currently involved in the development of CNN based
Computer Vision tools for prediction of solar flares
from images of the Sun, morphological classifications
of galaxies from optical images surveys and radio
galaxy source extraction from radio observations.
An Introduction to Parallel Computing
by Ajay Vibhute
airis4D, Vol.3, No.3, 2025
www.airis4d.com
2.1 Introduction
The history of computing is an interesting journey
that started in the 1830s with Charles Babbage’s innovative
design of the Analytical Engine, the first conceptual
mechanical computer. Over the next century, computing
technology advanced rapidly, leading to the birth of
digital computers in the late 1930s. Among these,
the Atanasoff-Berry Computer (ABC) was one of the
first to have an electronic Arithmetic and Logic Unit.
However, it was the ENIAC (Electronic Numerical
Integrator and Computer), completed in 1945, that truly
revolutionized the field, becoming the first general-
purpose, programmable digital computer. The 1950s
marked a major milestone with the advent of the
UNIVAC (Universal Automatic Computer), the first
commercially successful computer, followed by the
TRADIC (Transistor Digital Computer) in 1955, which
introduced transistors into the world of computing.
These innovations laid the foundation for the personal
computing revolution of the 1970s and 1980s.
In the early era of personal computing, single-
core processors such as the Intel 4004 and 8080
were primarily used. These processors could handle
only one instruction at a time, limiting their tasks
to basic, sequential operations. At that time,
parallel computing on personal computers was nearly
impossible due to the lack of adequate hardware,
software, and operating system support. However, the
development of multi-core processors, multi-processor
systems, and improvements in software multitasking
have dramatically changed the computing landscape.
Today, parallel computing is not just feasible but
crucial, enabling modern systems to perform multiple
tasks simultaneously and achieve remarkable gains in
efficiency and performance.
This article examines the programming models
that allow us to fully harness the capabilities of modern
hardware. Along the way, we’ll uncover the tools
and techniques that continue to drive innovation in the
ever-evolving field of computing.
2.2 Memory Architectures
Memory architecture is one of the key components
of computer design, defining how data is stored
and accessed. Memory architectures play a vital
role in parallel computing, and understanding these
architectures will help optimize system performance
and underlying resource utilization. The memory
architectures are broadly classified into three categories:
shared memory, distributed memory.
2.2.1 Shared Memory Architecture
Shared memory architecture is a commonly used
memory model in parallel computing, where multiple
processes access the same pool of memory. In such
cases, all processes are tied to a single memory unit,
allowing them to read and write to the same memory
pool. Shared access enables data exchange between
multiple processes in a simple form and eliminates the
need for data transfer over a network. However, as all
processes can access the same memory locations, this
leads to a need for efficient synchronization to prevent
data conflicts. Synchronization can be achieved through
the use of locks, semaphores, and barriers before
writing to memory locations that are used concurrently.
Although the shared memory model is the simplest
approach to achieve parallelism, it has limitations in
scalability. As the number of processes increases,
contention for memory access can create bottlenecks
and slow down the system.
The shared memory architecture can further be
divided into Uniform Memory Access (UMA) and
Non-Uniform Memory Access (NUMA). In UMA, all processors share the same memory pool, and each processor has equal access time to every memory location (Figure 1). While UMA is the simplest model for facilitating synchronization, it also has limited scalability.
Figure 1: Uniform Memory Access Architecture
In contrast, in NUMA, each processor is attached to its own local memory, and processors can also access the memory of other processors (Figure 2). Local memory can be accessed faster than remote memory, which reduces memory access contention and helps the system scale. However, this scalability comes at the cost of increased complexity in maintaining memory consistency.
Figure 2: Non-Uniform Memory Access Architecture
Shared memory parallelism can be achieved using POSIX Threads, which provide a set of APIs to create and manage threads. Alternatively, one can use OpenMP, which provides compiler directives and library routines focused on shared-memory parallelism.
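The article's named tools for shared-memory programming are POSIX Threads and OpenMP; purely as an analogue of the same synchronization idea, here is a minimal Python threading sketch in which a lock protects a shared counter (the variable names are illustrative):

```python
import threading

counter = 0                      # shared state living in one memory pool
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # The lock serialises writes to the shared variable, preventing
        # lost updates when several threads modify it concurrently.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # 400000 with the lock; without it, updates can be lost
```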
2.2.2 Distributed Memory Architecture
In distributed memory architecture, each processor in a system owns its dedicated local memory, which it can access without involving other processors (Figure 3). Unlike shared memory systems, where all processors access a shared memory pool, distributed memory systems rely on explicit communication between processes to exchange data over the network. In this setup, each process functions independently using its private memory, which leads to faster memory access and better scalability, since the system can grow simply by adding new processors and their associated memory units without altering the existing memory structure.
Figure 3: Distributed Memory Architecture
Distributed memory systems can further be classified into compute clusters, grid computing, and cloud computing. A compute cluster consists of several independent nodes, each with its own memory. These nodes are connected by a high-bandwidth, low-latency network, such as InfiniBand, and communicate with each other using the Message Passing Interface (MPI) to exchange data. Grid computing systems primarily
contain geographically distributed resources, mostly
using private networks for communication. Each
resource in the grid is an independent system. In
a cloud computing environment, distributed memory
architecture is used to provide on-demand resources.
Whenever there is a resource demand, compute nodes
are added to the resource pool and remain available
until the demand is fulfilled.
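Since distributed-memory processes exchange data only by explicit message passing, a minimal MPI sketch may help. This assumes the mpi4py package and an MPI runtime are installed (neither is prescribed by the article) and would be launched with something like `mpiexec -n 2 python demo.py`:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = {"payload": [1, 2, 3]}
    # No memory is shared: rank 0 must explicitly send the data to rank 1.
    comm.send(data, dest=1, tag=11)
elif rank == 1:
    data = comm.recv(source=0, tag=11)
    print(f"Rank 1 received {data} from rank 0")
```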
2.3 Summary
In summary, both shared memory and distributed
memory architectures are important in the parallel
computing domain, each with unique strengths and
challenges. Shared memory architecture simplifies
communication between processes by providing a single,
unified memory pool accessible to all processors. It
also simplifies synchronization and allows for faster
data exchange within a single system. However,
it has limitations in scalability as the number of
processors grows, leading to memory contention.
Robust synchronization mechanisms are essential to
avoid memory conflicts and ensure that each shared memory location is written by only one process at a time. On the
other hand, distributed memory architecture improves
scalability and fault tolerance by giving each processor
its own memory. This design allows systems to scale
more effectively and manage large-scale tasks. However,
it relies on explicit communication between processors,
which increases programming complexity and can
introduce communication overhead. Efficient message-
passing protocols, such as MPI, are critical for enabling
seamless data exchange in distributed environments.
The choice between shared memory and distributed
memory ultimately depends on the needs of the
application, the required scalability, performance, and
available resources. For many high-performance and
large-scale systems, a hybrid approach that combines
both architectures is often implemented to leverage
the strengths of each architecture, allowing optimal
resource utilization and system efficiency.
About the Author
Dr. Ajay Vibhute is currently working
at the National Radio Astronomy Observatory in
the USA. His research interests mainly involve
astronomical imaging techniques, transient detection,
machine learning, and computing using heterogeneous,
accelerated computer architectures.
Understanding Cosine Similarity: A
Mathematical Perspective
by Jinsu Ann Mathew
airis4D, Vol.3, No.3, 2025
www.airis4d.com
Mathematical concepts have a remarkable way
of transcending disciplines, seamlessly connecting
seemingly unrelated fields. One such concept is
cosine similarity—a tool that has proven invaluable
in modern data science, yet its foundations trace back
to fundamental principles that have been explored
for centuries. Whether analyzing language patterns,
building recommendation systems, or detecting
anomalies, cosine similarity offers an elegant approach
to measuring relationships in high-dimensional spaces.
At its core, cosine similarity is a measure of
how two entities align, disregarding differences in
scale and focusing purely on their direction. This
makes it particularly powerful in applications
where relationships matter more than raw
magnitudes—whether comparing documents,
identifying similar user preferences, or clustering
complex datasets.
3.1 Mathematical Formula for Cosine Similarity
Cosine similarity is a fundamental concept in
data science and machine learning, used to measure the
similarity between two vectors based on the cosine of the
angle between them. Unlike traditional distance-based
metrics, cosine similarity focuses on the orientation of
vectors rather than their magnitude. It ranges from -1
to 1, where 1 indicates that the vectors are identical
in direction, 0 means they are completely unrelated
(orthogonal), and -1 signifies that the vectors point in
opposite directions. This characteristic makes cosine
similarity particularly useful in applications where the
relative direction of data points matters more than their
absolute values.
Mathematically, cosine similarity is calculated
using the dot product of two vectors divided by the
product of their magnitudes. The formula is expressed
as:
$\cos(\theta) = \frac{A \cdot B}{\|A\|\,\|B\|} \quad (3.1)$

In this equation, A · B represents the dot product of the vectors, which sums the element-wise multiplication of their components. The terms ‖A‖ and ‖B‖ denote the Euclidean magnitudes (or norms) of the vectors, ensuring that the similarity measure remains independent of vector scale. By normalizing the dot product with the magnitudes, cosine similarity captures the degree of alignment between two vectors while disregarding their absolute lengths.
This property is particularly significant in high-
dimensional vector spaces, where traditional distance
measures like Euclidean distance may become less
meaningful. In fields such as natural language
processing (NLP) and information retrieval, text
documents and words are often represented as vectors
in a multi-dimensional space.
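In code, the formula translates directly into a few NumPy operations (the function name cosine_similarity and the sample vectors are illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])     # same direction, different magnitude
print(cosine_similarity(a, b))    # 1.0 (up to floating point error)
```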
3.2 Geometric Interpretation
The geometric interpretation of cosine similarity
further highlights its significance. By measuring
the angle between vectors rather than their absolute
difference, it provides a more intuitive understanding
of relationships in vector space. When two vectors
are closely aligned, the cosine value approaches 1,
indicating high similarity. As the angle increases, the
cosine value moves toward 0, signifying lower similarity.
If the vectors point in opposite directions, the similarity
becomes -1, meaning complete dissimilarity.
This concept is deeply connected to how vectors
are used in physics. Imagine two forces acting on an
object—when they are applied in the same direction,
their effects reinforce each other, just as a cosine
similarity of 1 indicates identical vectors. If they are
perpendicular, they have no direct influence on each
other, similar to a cosine similarity of 0. When forces
act in opposite directions, they cancel each other out,
resembling a cosine similarity of -1. Just as physicists
use angles between vectors to determine resultant forces
or interactions, data scientists use cosine similarity to
quantify relationships in high-dimensional spaces.
3.3 Cosine Similarity in Data Science
Cosine similarity is widely used in data science
for measuring the similarity between data points
represented as vectors. This metric is particularly
valuable in high-dimensional spaces, where traditional
distance-based measures like Euclidean distance may
not be effective. Cosine similarity focuses solely on
the direction of the vectors rather than their magnitude,
making it ideal for applications such as text analysis,
recommendation systems, and clustering algorithms.
One of the most common use cases of cosine
similarity is in natural language processing (NLP),
where documents or sentences are converted into vector
representations using techniques like TF-IDF (Term
Frequency-Inverse Document Frequency) or word
embeddings (e.g., Word2Vec, BERT). The similarity
between two documents can then be assessed by
computing the cosine of the angle between their
respective vectors. To better understand how cosine similarity functions in document comparison, consider three key cases (Figure 1):
Figure 1: Vector Similarity
(Image courtesy: https://ai.plainenglish.io/understanding-ai-similarity-search-8548912203a6)
1) Similar Documents (Cosine Similarity = 1)
When two document vectors are nearly aligned, the
cosine similarity is close to 1, indicating high similarity.
This means the documents share many common words
or have similar thematic content. In applications
like plagiarism detection, a high cosine similarity
suggests that two documents are highly alike, potentially
containing overlapping information. Similarly, in search
engines, documents with high similarity to a query are
ranked higher in search results.
2) Unrelated (Orthogonal) Documents (Cosine
Similarity = 0)
When two vectors are perpendicular, their cosine
similarity is 0, indicating no meaningful relationship
between them. In text analysis, this suggests that the
documents have little to no common terms or concepts.
For example, an article about ”machine learning” and
another about ”classical music” might have a near-zero
cosine similarity, as they contain distinct vocabularies
with minimal overlap. In recommendation systems,
two users with completely different preferences would
have orthogonal vectors, meaning their behaviors do
not influence each other's recommendations.
3) Opposite Documents (Cosine Similarity = -1)
When two document vectors point in exactly
opposite directions (180-degree angle), their cosine
similarity is -1, indicating complete dissimilarity or
even contradiction. This scenario is uncommon in
general text analysis but can be useful in sentiment
classification. For instance, a review with extremely
positive language (e.g., ”excellent, outstanding, highly
recommend”) would have a cosine similarity close to
-1 when compared to a review with extremely negative
language (e.g., ”terrible, worst, do not buy”). This
distinction helps in identifying conflicting opinions and
polarizing viewpoints in data.
Cosine similarity also plays a crucial role in
recommendation systems. In platforms like Netflix or
Amazon, users' preferences are represented as vectors,
where each dimension corresponds to a product or
movie rating. By calculating cosine similarity between
users' rating vectors, the system can suggest items
based on the preferences of similar users. If two users
have a high similarity score, recommendations can be
made by suggesting items liked by one user to the other.
This method enhances personalized recommendations
without needing direct overlap in purchases or views.
In computer vision, cosine similarity is used in
image recognition and classification. Deep learning
models, such as FaceNet, encode images into feature
vectors in high-dimensional space. When identifying
a face, the model compares the vector representation
of the new image with stored representations. A high
cosine similarity score indicates that the two images
likely belong to the same person. This technique is
widely used in security applications, such as facial
recognition systems for unlocking smartphones or
verifying identity at airports.
Another major application of cosine similarity
is in clustering and anomaly detection. In clustering
algorithms like K-means, cosine similarity helps group
data points that are directionally similar. This is
particularly useful in customer segmentation, where
businesses categorize customers based on shopping
patterns. In fraud detection, cosine similarity helps
detect unusual transactions. Regular transactions
form clusters with high similarity, while fraudulent
transactions appear as outliers with low similarity
scores.
3.4 Understanding Cosine Similarity
Through a Practical Example
To grasp the concept of cosine similarity in a real-
world setting, let's consider an example in text analysis.
Imagine we have three short documents:
Document A: ”Machine learning is a branch of
artificial intelligence.”
Document B: ”Deep learning is a subfield of
machine learning.”
Document C: ”Classical music compositions
follow structured patterns.”
We want to determine how similar these documents
are to each other using cosine similarity.
Converting Text into Vectors
Since computers do not process raw text, we first
convert each document into a vector representation.
One common method is the Bag of Words (BoW) or TF-
IDF (Term Frequency-Inverse Document Frequency)
model, where words are treated as features in a vector
space.
For simplicity, let's assume we build a feature
space using the key terms: [”machine”, ”learning”,
”artificial”, ”intelligence”, ”deep”, ”subfield”, ”music”,
”compositions”, ”structured”, ”patterns”].
Now, each document can be represented as a vector
based on the presence of these words:
Vector A: [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
Vector B: [1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
Vector C: [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
Calculating Cosine Similarity
By computing cosine similarity:
Similarity(A, B) = 0.5. Documents A and B are clearly related, as they both discuss machine-learning topics. The terms they share ("machine" and "learning") give them a cosine similarity of 0.5, indicating a moderately strong relationship.
Similarity(A, C) = 0. Documents A and C are
completely different, as one discusses machine learning
and the other discusses music. Their cosine similarity
is 0, meaning they are orthogonal.
Similarity(B, C) = 0. Documents B and C are also
unrelated, reinforcing that cosine similarity effectively
captures semantic relationships.
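These numbers can be checked with a few lines of NumPy using the vectors defined above (the helper function is the same illustrative one as before):

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

A = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
B = np.array([1, 1, 0, 0, 1, 1, 0, 0, 0, 0])
C = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])

print(cosine_similarity(A, B))   # 0.5  (two shared terms out of four in each)
print(cosine_similarity(A, C))   # 0.0  (no shared terms)
print(cosine_similarity(B, C))   # 0.0  (no shared terms)
```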
3.5 Conclusion
Cosine similarity is a powerful mathematical tool
for measuring the similarity between vectors, making it
widely applicable across various fields. By leveraging
the cosine of the angle between vectors rather than their
magnitudes, it provides a robust measure of similarity
that remains unaffected by differences in scale.
From a mathematical perspective, cosine similarity
is based on the dot product and the magnitude of
vectors, ensuring that only the directional alignment of
data points is considered. Its geometric interpretation
further enhances its utility, as it enables an intuitive
understanding of relationships in vector space—where
similarity corresponds to closely aligned vectors,
dissimilarity to orthogonal vectors, and opposition to
vectors pointing in opposite directions.
In data science, cosine similarity plays a
crucial role in tasks such as document retrieval,
recommendation systems, and text clustering, helping to
identify meaningful relationships in high-dimensional
data. A practical example demonstrates how the
measure can be applied to compare documents, showing
how shared terms influence similarity scores.
Ultimately, cosine similarity serves as a
fundamental metric in machine learning, natural
language processing, and numerous analytical
applications. Its ability to quantify similarity efficiently
makes it a vital tool for handling vast amounts of data,
reinforcing its significance in computational and real-
world contexts.
References
Understanding AI Similarity Search
From physics to data science: the beauty and
power of cosine similarity
The Role of Cosine Similarity in Vector Space
What is Cosine Similarity: A Comprehensive
Guide
Unveiling the Power of Cosine Similarity in Text
Analysis
About the Author
Jinsu Ann Mathew is a research scholar
in Natural Language Processing and Chemical
Informatics. Her interests include applying basic
scientific research on computational linguistics,
practical applications of human language technology,
and interdisciplinary work in computational physics.
About airis4D
Artificial Intelligence Research and Intelligent Systems (airis4D) is an AI and Bio-sciences Research Centre.
The Centre aims to create new knowledge in the field of Space Science, Astronomy, Robotics, Agri Science,
Industry, and Biodiversity to bring Progress and Plenitude to the People and the Planet.
Vision
Humanity is in the 4th Industrial Revolution era, which operates on a cyber-physical production system. Cutting-
edge research and development in science and technology to create new knowledge and skills become the key to
the new world economy. Most of the resources for this goal can be harnessed by integrating biological systems
with intelligent computing systems offered by AI. The future survival of humans, animals, and the ecosystem
depends on how efficiently the realities and resources are responsibly used for abundance and wellness. Artificial
intelligence Research and Intelligent Systems pursue this vision and look for the best actions that ensure an
abundant environment and ecosystem for the planet and the people.
Mission Statement
The 4D in airis4D represents the mission to Dream, Design, Develop, and Deploy Knowledge with the fire of
commitment and dedication towards humanity and the ecosystem.
Dream
To promote the unlimited human potential to dream the impossible.
Design
To nurture the human capacity to articulate a dream and logically realise it.
Develop
To assist the talents to materialise a design into a product, a service, a knowledge that benefits the community
and the planet.
Deploy
To realise and educate humanity that a knowledge that is not deployed makes no difference by its absence.
Campus
Situated in a lush green village campus in Thelliyoor, Kerala, India, airis4D was established under the auspices of the SEED Foundation (Susthiratha, Environment, Education Development Foundation), a not-for-profit company for promoting Education, Research, Engineering, Biology, Development, etc.
The whole campus is powered by Solar power and has a rain harvesting facility to provide sufficient water supply
for up to three months of drought. The computing facility in the campus is accessible from anywhere through a
dedicated optical fibre internet connectivity 24×7.
There is a freshwater stream that originates from the nearby hills and flows through the middle of the campus.
The campus is a noted habitat for the biodiversity of tropical fauna and flora. airis4D carries out periodic and systematic water quality and species diversity surveys in the region to ensure its richness. It is our pride that the site has consistently been environment-friendly and rich in biodiversity. airis4D also grows fruit plants that can feed birds and maintains water bodies to help them survive the drought.