Cover page
Species Name: Crocothemis servilia.
Crocothemis servilia is commonly known as the scarlet skimmer and belongs to the Order Odonata, Family Libellulidae. In this picture, Crocothemis servilia is in the obelisk posture. The "obelisk posture" refers to a handstand-like position adopted by certain species of dragonflies and damselflies in order to regulate their body
temperature and prevent overheating on sunny days. This unique behavior involves the elevation of the abdomen
until its apex is oriented toward the sun, thereby reducing the amount of body surface exposed to direct solar
radiation. This alignment becomes most pronounced when the sun is positioned almost directly overhead,
creating a visual resemblance to an obelisk-like structure.
Managing Editor: Ninan Sajeeth Philip
Chief Editor: Abraham Mulamootil
Editorial Board: K Babu Joseph, Ajit K Kembhavi, Geetha Paul, Arun Kumar Aniyan
Correspondence: The Chief Editor, airis4D, Thelliyoor - 689544, India
Journal Publisher Details
Publisher : airis4D, Thelliyoor 689544, India
Website : www.airis4d.com
Email : nsp@airis4d.com
Phone : +919497552476
Editorial
by Fr Dr Abraham Mulamoottil
airis4D, Vol.1, No.9, 2023
www.airis4d.com
The 9th edition of the airis4D Journal must not overlook India's recent strides in its space program. The voyage to the moon began with Chandrayaan-1's launch in 2008, a significant milestone in India's space journey. This mission proved pivotal as India's first successful endeavour beyond Earth. Notably, Chandrayaan-1's breakthrough discovery of water molecules on the moon's surface transformed scientists' perception of our closest cosmic neighbour. The mission's orbiter played a key role, opening doors for forthcoming missions to investigate lunar resources and delve into the moon's composition. Taking inspiration from Chandrayaan-1's success, India embarked on the ambitious Chandrayaan-2 mission in 2019. This mission aimed to expand lunar knowledge by endeavouring to land a rover softly on the moon's surface. Although the lander experienced
a challenging touchdown, the orbiter continued its lunar orbit, beaming back valuable data. Chandrayaan-2
underlined India's prowess in executing intricate space manoeuvres and brought it closer to achieving a delicate lunar landing. A groundbreaking achievement arrived in August 2023 with India's most recent lunar endeavour, Chandrayaan-3. It accomplished a triumphant landing near the lunar south pole, firmly establishing India's status as a prominent figure in the global space arena. Beyond the successful touchdown, Chandrayaan-3 showcased India's unwavering determination and cutting-edge space technology, solidifying its position as a trailblazer. The mission's success underscored India's commitment to exploring uncharted lunar territories. These lunar expeditions encompass more than scientific quests; they encapsulate India's aspirations
to transcend conventional boundaries in space exploration. They have yielded invaluable insights into lunar
geology and composition, laying the groundwork for future lunar endeavours and potentially even setting the
stage for human exploration. As India consistently invests in space technology, its lunar endeavours inspire its
citizens and the global community. These missions are not just contributions to scientific knowledge but also
sources of pride and curiosity that transcend geographical borders. The Indian Space Research Organisation
(ISRO) has notched remarkable milestones in space exploration, spotlighting India's prowess in the field. However, persistent Western scepticism and condescension regarding India's simultaneous investment in space
and efforts to tackle domestic challenges like poverty remain a recurring narrative. Critics question resource
allocation, suggesting funds could be better applied elsewhere. India's retort to these concerns underscores its position as the world's fifth-largest economy and emphasises its accomplishments, pointing out that economic growth and technological advancement are paralleled by initiatives addressing societal development. India's accomplishments in space challenge colonial narratives of inferiority and echo its ascent as a global force. These successes resonate with historical contexts, countering economic exploitation during British colonisation and spotlighting India's determination to reshape its narrative on a global stage. India's successful lunar landings effectively dispel the claims of fabrication that have plagued other space programs, and ISRO's transparent sharing of scientific data validates the authenticity of India's achievements and its dedication to credible space exploration. This ongoing debate shows how important it is for a nation to shape its own narrative, sidestepping external manipulation and misrepresentation: controlling the narrative is pivotal to presenting a balanced and precise portrayal of achievements and challenges, free of external biases and misconceptions. In conclusion, India's journey is
multifaceted, navigating intricate paths toward progress. A common misconception often arises: Can a country
effectively pursue its development agenda while addressing poverty? The answer, as India has demonstrated,
is a resounding yes. India's development agenda and poverty alleviation projects are not opposing forces; they are two sides of the same coin, and the two facets synergise to uplift the nation. This edition of airis4D first explores the "Difference Boosted Neural Network" (DBNN) architecture and its extension, the "Enhanced Difference Boosted Neural Network" (E-DBNN). The approach
enhances performance compared to traditional methods like Naive Bayes. The author, Blesson George, shares
E-DBNN’s Python code on GitHub and focuses on machine learning algorithms for protein studies. The second
article discusses text summarisation techniques in "From Information Overload to Clarity: The Power of Text Summarization (Part 2)" by Jinsu Ann Mathew. It covers extractive and abstractive summarisation methods.
Extractive summarisation selects essential sentences from the source text to form a summary, while abstractive
summarisation generates new sentences that capture the essence. Abstractive methods include structure-based,
maintaining original structure, and semantic-based, creating new sentences with similar meanings. The article
equips readers to understand different summarisation strategies and their applications. The third article, "Guide to Practical Machine Learning for Astronomy - Part I" by Linn Abraham, provides a practical guide for those
in a scientific background interested in entering machine learning. It outlines the stages of a machine learning
project, covering technical setup, programming languages, and code version control. The article also introduces
resources for learning Python, machine learning courses, and using GitHub. It emphasises the importance of
version control using Git and highlights helpful resources for learning and development. The fourth article,
"The Hertzsprung-Russell Diagram: Exploring Stellar Evolution and Diversity" by Robin Jacob Roy, introduces
the Hertzsprung-Russell (HR) diagram, a fundamental tool in astronomy. The diagram categorises stars based
on luminosity, temperature, spectral type, and evolutionary stage, revealing insights into their properties and
life cycles. It discusses the main features of the HR diagram, such as luminosity, temperature, spectral type, and
evolutionary stage. It explains its use in understanding different types of stars, including main sequence stars,
red giants, supergiants, white dwarfs, and more. The HR diagram is a crucial tool for comprehending stellar
evolution and diversity. The fifth article, "X-ray Binaries" by Sindhu G, discusses X-ray binaries, a category
of binary star system containing a compact object (neutron star or black hole) and a companion star. These
binaries emit X-ray radiation due to material accumulation onto the compact object, often through processes like
Roche lobe overflow. They can be classified into low-mass X-ray binaries (LMXBs), high-mass X-ray binaries
(HMXBs), and intermediate-mass X-ray binaries (IXRBs) based on the mass of the companion star. LMXBs
involve low-mass stars transferring material through Roche lobe overflow, HMXBs consist of massive stars with
strong stellar winds, and IXRBs feature intermediate-mass stars as donors. These binaries provide insights into
accretion processes and interactions between stars and compact objects, contributing to our understanding of the
universe. The sixth article, "Microflora of the Intestine," by Geetha Paul, discusses the complex gut microbiota
ecosystem, which consists of over 400 to 1000 bacterial species in the human digestive tract. It highlights how
this ecosystem impacts digestion, nutrient absorption, immune modulation, and protection against pathogens.
Factors like diet, genetics, and environment influence the microbiota composition. Dysbiosis, an imbalance in
the microbiota, is linked to various health conditions. The article also covers advanced techniques, such as DNA
sequencing and metabolomics, used to study and understand the gut microbiota's roles in maintaining health and
causing diseases. The last article, "Understanding Convolutional Neural Networks" by Ninan Sajeeth Philip,
explains the significance and functioning of Convolutional Neural Networks (CNNs). CNNs are a class of neural
networks well-suited for grid-like data, particularly images. The article delves into how CNNs automatically
generate features from images and discusses their architecture, including convolutional and pooling layers. It
also covers key concepts such as activation functions, strides, padding, and flattening. The article provides a
code example using TensorFlow and Keras to create a simple CNN model for image classification using the
CIFAR-10 dataset.
Contents
Editorial ii
I Artificial Intelligence and Machine Learning 1
1 Difference Boosted Neural Network(DBNN) - Part 4 2
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Multiple feature connections as likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Code Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 From Information Overload to Clarity: The Power of Text Summarization (Part 2) 5
2.1 Construction of an intermediate representation of the input text . . . . . . . . . . . . . . . . . 6
2.2 Scoring the sentences based on the representation . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Selection of a summary comprising several sentences . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Structure-Based Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.5 Semantic-Based Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
II Astronomy and Astrophysics 10
1 Guide to Practical Machine Learning for Astronomy - Part I 11
1.1 Stages of a Machine Learning Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2 Setting up your computer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3 The programming setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4 Code version control and Github . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Where to get help? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2 The Hertzsprung-Russell Diagram: Exploring Stellar Evolution and Diversity 15
2.1 Key Features of the HR Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Various Objects on the HR Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3 A Journey into the Enigmatic Universe of X-ray Binaries 19
3.1 X-ray binaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
III Biosciences 23
1 Microflora of the intestine 24
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.2 Composition of the normal gut biota . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3 Factors that contribute to the composition of the typical gut microbiota . . . . . . . . . . . . 25
1.4 Current methods to study gut microbiota . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.5 Bioinformatics Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.6 Metabolomics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.7 Integrated Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
IV Computer Programming 30
1 Understanding Convolutional Neural Networks 31
Part I
Artificial Intelligence and Machine Learning
Difference Boosted Neural Network(DBNN) -
Part 4
by Blesson George
airis4D, Vol.1, No.9, 2023
www.airis4d.com
1.1 Introduction
In the preceding discussions, we have thoroughly explored a range of distinct aspects and applications
associated with the DBNN (Difference Boosted Neural Network) architecture. However, the scope of this
current article pivots towards an intriguing extension of this network paradigm, specifically denoted as the
"Enhanced Difference Boosted Neural Network", or E-DBNN for brevity.
We are placing significant emphasis on two fundamental principles that underlie the DBNN framework.
These principles are the imposition of conditional independence and the utilization of the difference boosting
technique. Notably, the core necessity of a Naive Bayes classifier, which hinges on the concept of conditional
independence, is effectively satisfied within the DBNN framework through the implementation of imposed
conditional independence mechanisms.
Within the framework of DBNN, the fulfillment of conditional independence is achieved by utilizing class-
based joint probabilities as derived features. On the other hand, in the context of E-DBNN, the derived feature
is constructed by considering the joint occurrence of multiple features within distinct bins corresponding to
different classes. This approach serves to firmly establish the notion of conditional independence.
This underlying assumption draws upon the notion that, in the majority of classification scenarios, there
exists an overlap among several features. It is the features that contribute to discrimination that create distinctions
between diverse entities. By amplifying these feature values, a more nuanced and insightful outcome can be
derived compared to the analysis of linear features alone.
1.2 Multiple feature connections as likelihood
Bayesian classifier networks, built upon the foundational principles of Bayes' theorem, undertake the task
of classifying provided data denoted as x by employing the following equation:
C(x) = argmax_c [ P(c) × P(x_1, x_2, x_3, ..., x_m | c) ]
where C(x) represents the class variable. Considering conditional independence, the above equation becomes
C(x) = argmax_c [ P(c) × P(x_1|c) P(x_2|c) P(x_3|c) ... P(x_m|c) ]
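As a concrete illustration of this decision rule (not the author's implementation), here is a minimal Python sketch that evaluates the Naive Bayes form of the equation for a single input; the priors and per-feature likelihood tables are toy numbers invented for the example.

```python
# naive_bayes_argmax.py -- minimal sketch of the Naive Bayes decision rule above.
# The priors and likelihood tables are toy numbers, not values from the article.
priors = {"class_0": 0.6, "class_1": 0.4}                  # P(c)
# likelihoods[c][i][v] = P(x_i = v | c) for two binary features; illustrative only
likelihoods = {
    "class_0": [{0: 0.7, 1: 0.3}, {0: 0.4, 1: 0.6}],
    "class_1": [{0: 0.2, 1: 0.8}, {0: 0.5, 1: 0.5}],
}

def classify(x):
    """Return argmax_c of P(c) * prod_i P(x_i | c) for a feature vector x."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for i, value in enumerate(x):
            score *= likelihoods[c][i][value]
        scores[c] = score
    return max(scores, key=scores.get)

print(classify([1, 0]))   # 0.6*0.3*0.4 = 0.072 vs 0.4*0.8*0.5 = 0.160 -> "class_1"
```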
In the DBNN approach, a different strategy is employed. Here, the features undergo a discretization
process, wherein the feature values are categorized into distinct bins. Subsequently, the feature values within
these bins are sorted into class-specific categories. Rather than directly utilizing the original feature values,
the approach involves the utilization of histograms containing feature values from various bins, each aligned
with different classes. This histogram-based representation is then employed as the likelihood component in the
classification process.
The likelihood for a conditionally dependent event A can be approximated as the product of the likelihoods of its paired input features:
L(A|b, c, d, e) = L(A|b, c) × L(A|b, d) × L(A|b, e) × L(A|c, d) × L(A|c, e) × L(A|d, e)
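Read literally, this approximation multiplies the likelihoods of every pair of input features. A small sketch of that bookkeeping, using a made-up table of pair likelihoods (in DBNN these would come from the binned histograms described below), might look like this:

```python
# pairwise_likelihood.py -- sketch of L(A | b, c, d, e) as a product over feature pairs.
from itertools import combinations

# Toy pair likelihoods L(A | pair); placeholder values for illustration only.
pair_likelihood = {
    ("b", "c"): 0.30, ("b", "d"): 0.25, ("b", "e"): 0.40,
    ("c", "d"): 0.35, ("c", "e"): 0.20, ("d", "e"): 0.45,
}

def likelihood_of_A(features):
    """Approximate L(A | features) as the product of all pairwise likelihoods."""
    result = 1.0
    for pair in combinations(sorted(features), 2):
        result *= pair_likelihood[pair]
    return result

print(likelihood_of_A(["b", "c", "d", "e"]))   # product of the six pair terms
```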
In the context of E-DBNN, the computation of likelihood takes the form of a histogram capturing multiple
feature interactions. Following the discretization process, the feature vector within the training data is represented
as a two-dimensional grid denoted as F_{i,m}. Here, the rows correspond to distinct bins (m), while the columns
represent individual features (i). Consequently, in a dataset comprising n feature vectors, there emerge n
two-dimensional structures.
The focus then shifts towards examining the interconnections between these features. The resulting
histogram of these connections assumes the role of the conditional probability distribution, forming an integral
part of the likelihood computation process.
Consider an input dataset where the value of Feature A is placed within the first bin, Feature B’s value falls
into the third bin, and Feature C’s value is categorized into the fourth bin. This relationship can be symbolically
represented as ζ(A_1 B_3 C_4). Furthermore, if there exists a fourth feature, Feature D, and its value is assigned to the second bin, another conceivable connection emerges, denoted as ζ(B_3 C_4 D_2).
In the scenario where two-feature connections are being examined, they are denoted as ζ(A_1 B_3), ζ(B_3 C_4), and ζ(C_4 D_2). These connections can be generally indicated as F_{i+j,m}, where j varies within the range from 1 to N_c, and N_c denotes the number of features involved in the connection.
The quantity of features amalgamated to establish a connection is defined as a hyper-parameter of the
network. The choice of this parameter is based on the observation that as features are binned, their conditional
interdependence on other features gradually coalesces into a few connections.
Throughout the training phase, all such connections within the data are quantified and represented as C_{F_{ij,m}}. Furthermore, if the data encompasses N classes, similar connections can be constructed for each of
those classes.
For each class, these connections contribute to the establishment of conditional probabilities that define
the likelihood of features residing within specific bins.
The posterior value is approximated as the product of the likelihood values, multiplied by the prior weights of all the connections involved in the feature vector.
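To make the training and prediction steps described above concrete, here is a simplified, toy sketch (not the released E-DBNN code; see the GitHub link in the next section). It bins the features, counts multi-feature bin "connections" per class during training, and scores a new sample as the product of the resulting connection likelihoods and a class prior; the bin count, floor value and data are all invented for the example.

```python
# edbnn_sketch.py -- toy illustration of counting feature-bin connections per class.
# A simplified sketch of the idea only, not the released E-DBNN implementation.
import numpy as np
from collections import defaultdict
from itertools import combinations

N_BINS = 4        # number of bins per feature (example value)
N_CONNECT = 2     # features per connection (the hyper-parameter N_c)

def to_bins(X, low=0.0, high=1.0):
    """Discretize each feature value into one of N_BINS equal-width bins."""
    edges = np.linspace(low, high, N_BINS + 1)[1:-1]
    return np.digitize(X, edges)

def train(X, y):
    counts = defaultdict(lambda: defaultdict(float))   # counts[class][connection]
    priors = defaultdict(float)
    B = to_bins(X)
    for bins, label in zip(B, y):
        priors[label] += 1.0
        for idx in combinations(range(X.shape[1]), N_CONNECT):
            connection = tuple((i, bins[i]) for i in idx)   # e.g. ((0, 1), (2, 3)) ~ A1 C3
            counts[label][connection] += 1.0
    for label, total in priors.items():
        for conn in counts[label]:
            counts[label][conn] /= total                    # likelihood of the connection
    for label in priors:
        priors[label] /= len(y)
    return counts, priors

def predict(x, counts, priors):
    bins = to_bins(x.reshape(1, -1))[0]
    scores = {}
    for label, prior in priors.items():
        score = prior
        for idx in combinations(range(len(x)), N_CONNECT):
            connection = tuple((i, bins[i]) for i in idx)
            score *= counts[label].get(connection, 1e-6)    # small floor for unseen connections
        scores[label] = score
    return max(scores, key=scores.get)

# tiny synthetic example
rng = np.random.default_rng(0)
X = rng.random((60, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
counts, priors = train(X, y)
print(predict(np.array([0.9, 0.8, 0.2]), counts, priors))   # most likely prints 1
```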
1.3 Code Availability
The source code of E-DBNN, in Python, is available on the GitHub page: https://github.com/blessoncms/E-DBNN
1.4 Summary
E-DBNN is characterized as a Bayesian classifier that attains enhanced performance outcomes through
the strategic utilization of imposed conditional independence and difference boosting methodologies. By
conducting a comparative analysis, we have evaluated the accuracy of E-DBNN against alternative approaches
grounded in Naive Bayes principles. The findings of this comparison indicate a consistent and systematic
enhancement in the obtained results, thereby underscoring the efficacy of the E-DBNN approach.
About the Author
Blesson George is currently working as Assistant Professor of Physics at CMS College Kottayam,
Kerala. His research interests include developing machine learning algorithms and application of machine
learning techniques in protein studies.
From Information Overload to Clarity: The
Power of Text Summarization (Part 2)
by Jinsu Ann Mathew
airis4D, Vol.1, No.9, 2023
www.airis4d.com
In the last article, we explored different techniques for condensing text. We focused on two key aspects:
selecting what to summarize based on the type of input and the intended goal. Now, in this article, we are
shifting our focus to the last type of summarization, which centers on the desired output. This comprises two
methods: extractive summarization and abstractive summarization (Figure 1).
Extractive summarization involves selecting and extracting existing sentences or phrases directly from the
source text to create a summary. These selected sentences are usually the most important, informative, or
representative ones from the original text. This approach doesn’t involve generating new sentences; instead, it
focuses on choosing the most relevant parts that maintain the essence of the original content. Think of it as a
process of ”copying and pasting” significant portions from the source text to create a summary.
Abstractive summarization is a more advanced approach that involves generating new sentences that may not
exist in the source text. Instead of simply copying sentences, abstractive summarization involves understanding
the content of the source text and using that understanding to create concise and coherent summaries in a
more human-like manner. This method requires natural language generation techniques and often involves
paraphrasing and rephrasing to capture the core ideas while using different words and structures.
In essence, extractive summarization is like selecting puzzle pieces from the original picture, while
abstractive summarization is like creating a new puzzle using the same theme and colors but rearranging the
pieces to form a coherent and concise representation. Both approaches have their strengths and challenges, and researchers continue to explore ways to improve their effectiveness and efficiency. In this article, we will closely examine these two methods.
(image courtesy: https://www.abstractivehealth.com/extractive-vs-abstractive-summarization-in-healthcare)
Figure 1: Extractive and Abstractive summarization
Basic tasks of extractive summarization
Extractive summarization systems are a type of text summarization that selects a subset of sentences from
the input text to form a summary. They are centered on three core operations: 1) Construction of an intermediate
representation of the input text, 2) Scoring the sentences based on the representation, 3) Selection of a summary
comprising several sentences. In the following section, we will discuss each of these operations in more detail.
2.1 Construction of an intermediate representation of the input text
When constructing an intermediate representation of the input text for extractive summarization, the goal is
to transform the textual content into a structured format that captures the essential information of each sentence.
This representation serves as a foundation for subsequent steps like scoring and selection. To achieve this, each
sentence from the tokenized text is assigned a numerical representation. This is often accomplished through the
use of word embeddings, such as Word2Vec or GloVe, which convert words into multi-dimensional vectors in a
semantic space. By averaging or combining these word vectors, sentence representations are formed, capturing
the semantic context of the sentence’s content.
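As a rough sketch of this averaging step, using a tiny hand-written embedding table in place of real Word2Vec or GloVe vectors, sentence representations could be built like this:

```python
# sentence_vectors.py -- averaging word vectors into sentence representations.
# The tiny embedding table below is a stand-in for real Word2Vec/GloVe vectors.
import numpy as np

embeddings = {
    "cats":  np.array([0.9, 0.1, 0.0]),
    "dogs":  np.array([0.8, 0.2, 0.1]),
    "sleep": np.array([0.1, 0.9, 0.3]),
    "run":   np.array([0.2, 0.7, 0.8]),
}
DIM = 3   # dimensionality of the toy vectors

def sentence_vector(sentence):
    """Average the vectors of known words; unknown words are simply skipped."""
    words = [w for w in sentence.lower().split() if w in embeddings]
    if not words:
        return np.zeros(DIM)
    return np.mean([embeddings[w] for w in words], axis=0)

sentences = ["Cats sleep", "Dogs run", "Cats and dogs run"]
matrix = np.vstack([sentence_vector(s) for s in sentences])   # one row per sentence
print(matrix.shape)   # (3, 3): three sentences, three-dimensional representation
```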
Once individual sentence representations are established, they are combined or aggregated to generate
a unified representation of the entire input text. Various methods can be employed for this aggregation,
including simple averaging of sentence vectors, weighted averages that consider sentence importance, or more
complex mathematical techniques like Principal Component Analysis (PCA) or clustering. This aggregated
representation transforms the textual content into a structured data format, such as a matrix, where each row
corresponds to a sentence and each column represents different dimensions or features.
In essence, the creation of the intermediate representation involves converting the textual information into
a structured numerical format that captures semantic meaning. This representation enables effective analysis
of the source text and plays a vital role in generating meaningful and coherent extractive summaries. It serves
as the intermediary step that bridges the gap between the raw textual data and the subsequent computational
processes required for extractive summarization.
2.2 Scoring the sentences based on the representation
This step involves assessing the significance and relevance of each sentence in the source text through
the use of numerical values derived from its transformed representation. The scores assigned to sentences are
determined by a range of metrics and methodologies. One such metric is TF-IDF (Term Frequency-Inverse
Document Frequency), which gauges the importance of a word within a sentence relative to its frequency in the
entire document. Sentences containing rare but significant terms can thus attain higher scores. Additionally,
semantic analysis techniques come into play, leveraging natural language processing and machine learning
algorithms to comprehend the contextual import of sentences in relation to the broader text.
Positional and structural elements also contribute to scoring. Introductory sentences or conclusive remarks
may carry greater weight due to their positions within the text’s narrative flow. Furthermore, structural cues
such as headings might influence the scoring process. Weighted scoring, where specific features receive varying
degrees of importance, allows terms of exceptional significance to hold more weight in the final score calculation.
Ultimately, the scoring of sentences based on their representation serves as a compass, guiding the
summarization system toward the selection of sentences that collectively construct a coherent and concise
summary reflective of the source text's main content.
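One simple way to realise such TF-IDF-based scoring is sketched below with scikit-learn's TfidfVectorizer, summing each sentence's term weights to obtain a score; this is only one of many possible scoring schemes, and the example sentences are invented.

```python
# tfidf_scores.py -- scoring sentences by the sum of their TF-IDF term weights.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "Text summarization condenses long documents.",
    "The weather was pleasant yesterday.",
    "Extractive summarization selects important sentences from the document.",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(sentences)        # sparse matrix: sentences x terms

scores = np.asarray(tfidf.sum(axis=1)).ravel()     # one aggregate score per sentence
for sentence, score in zip(sentences, scores):
    print(f"{score:.3f}  {sentence}")
```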
2.3 Selection of a summary comprising several sentences
The summarization system picks out the most important sentences to create a summary. Some methods
choose these sentences using step-by-step strategies, always going for the most crucial ones based on their
scores. Others approach the task like solving a puzzle, aiming to find a group of sentences that make the
summary both important and clear while avoiding repetition.
One way to do this is by using greedy algorithms, which means the system goes through sentences one by
one, selecting the most important until it reaches the desired number of sentences, often represented as ’k’. This
way, it builds the summary step by step, making efficient decisions along the way.
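A minimal sketch of that greedy step, given per-sentence scores like those computed earlier, is simply to take the top k sentences and re-emit them in their original order:

```python
# greedy_select.py -- pick the k highest-scoring sentences, keep original order.
def greedy_summary(sentences, scores, k=2):
    """Return the k top-scoring sentences, presented in document order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:k])                  # restore original document ordering
    return [sentences[i] for i in chosen]

sentences = ["Sentence one.", "Sentence two.", "Sentence three."]
scores = [0.2, 0.9, 0.5]
print(greedy_summary(sentences, scores, k=2))    # ['Sentence two.', 'Sentence three.']
```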
On the other hand, another approach treats sentence selection as a kind of puzzle-solving. The system tries
to find the best combination of sentences that not only makes the summary informative and coherent but also
keeps it from being repetitive. This method often uses math and graphs to figure out the best mix of sentences
that work well together.
Overall, the goal is to create a summary that captures the main points, reads smoothly, and doesn't repeat itself. It's about finding the right balance, using clever strategies to build a summary that mirrors the original text's essence.
Abstractive summarization
Now, let’s dive into the world of abstractive summarization. This is where things get interesting because
we’re not just copying sentences anymore. Abstractive summarization allows us to express ourselves more
creatively while making sure we still catch the main ideas. We'll look at two different ways to do this: one that focuses on keeping the structure of the original text (structure-based approach) and another that's all about understanding the meaning and expressing it in a fresh way (semantic-based approach) (Figure 2). It's like exploring two paths to create summaries that stand out and convey the important stuff in a cool new style.
2.4 Structure-Based Approach
The structure-based approach in abstractive summarization emphasizes the preservation of the underlying
structure of the source text. This method aims to generate a summary that not only captures the main ideas but
also follows a coherent narrative flow similar to the original text. It often involves reorganizing and paraphrasing
sentences while maintaining the logical and sequential arrangement of ideas.
In this approach, the summarization system first analyzes the structural elements of the source text, such
as headings, subheadings, paragraph divisions, and transitions between sections. Then, it employs techniques
like syntactic analysis and grammar parsing to understand how sentences are interconnected. The system might
generate sentences that are not verbatim repetitions from the source text but capture the essence and flow of the
content.
(image courtesy: https://www.researchgate.net/figure/Overview-of-Abstractive-Summarization-This-paper-collectively-summarizes-the-major_fig4_305912913)
Figure 2: Two types of abstractive summarization
2.5 Semantic-Based Approach
The semantic-based approach in abstractive summarization focuses on understanding the meaning of the
source text and generating summaries that convey this meaning using different words and phrasing. This method
goes beyond surface-level paraphrasing and aims to generate novel sentences that capture the same semantic
information as the original text.
In this approach, the summarization system employs natural language understanding techniques, such as
deep learning models and neural networks. These models are trained on large amounts of text data to learn the
relationships between words, phrases, and concepts. The system uses this learned understanding to generate
coherent and contextually appropriate sentences that express the key ideas of the source text. Semantic-based
summarization often involves rephrasing, word substitution, and even generating entirely new sentences to
convey the intended meaning.
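In practice, such neural abstractive systems are usually accessed through pre-trained models. As one hedged example, the Hugging Face transformers library exposes a summarization pipeline; the model named below is one publicly available option, chosen purely for illustration, and the input text is made up.

```python
# abstractive_demo.py -- abstractive summarization with a pre-trained model.
# Requires the `transformers` package; the model weights are downloaded on first use.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

text = (
    "Abstractive summarization systems read a document, build an internal "
    "representation of its meaning, and then generate new sentences that "
    "convey the same ideas in a shorter form, rather than copying sentences verbatim."
)

result = summarizer(text, max_length=40, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```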
To summarize, the structure-based approach focuses on maintaining the original text's structural organization while rephrasing and rearranging sentences for coherence. The semantic-based approach, on the other
hand, aims to capture the underlying meaning of the text and generates novel sentences that convey the same
concepts. Both approaches contribute to the challenging task of abstractive summarization, where the goal is
to create concise and human-like summaries that capture the core ideas of the source material.
Conclusion
In the world of making summaries from text, we’ve been exploring different ways to capture the main ideas
and make things shorter. We’ve learned how to condense information to match various needs and goals. As we
conclude our exploration, we find ourselves equipped with a good understanding of three fundamental types of
summarization techniques: those based on input, output type, and purpose.
Exploring both single and multiple documents, we've seen how to make information shorter. Whether
from an individual source or a collection of related documents, the ability to distill key information equips us to
navigate the sea of content more efficiently.
Unveiling the concept of summarization based on purpose, we've been introduced to the dynamic trio of generic, query-based, and domain-specific summarization. Every type has its own special way of making
summaries that match what users need. This can be as general as getting the main ideas, or as specific as
answering questions or focusing on particular topics.
Introducing the idea of summarization focused on the output, we delve into two techniques known as
extractive and abstractive summarization. These approaches assist in selecting crucial sentences or generating
fresh ones to grasp the key ideas.
All these techniques contribute significantly to enhancing our understanding of text summarization. By
delving into these methods, we gain valuable insights into the various ways to condense information effectively.
These approaches collectively equip us with the knowledge and tools needed to master the art of creating concise
and meaningful summaries.
About the Author
Jinsu Ann Mathew is a research scholar in Natural Language Processing and Chemical Informatics.
Her interests include applying basic scientific research on computational linguistics, practical applications of
human language technology, and interdisciplinary work in computational physics.
Part II
Astronomy and Astrophysics
Guide to Practical Machine Learning for
Astronomy - Part I
by Linn Abraham
airis4D, Vol.1, No.9, 2023
www.airis4d.com
This article is meant to be a guide for people from a scientific background who are interested in venturing into the field of machine learning but lack the technical know-how. This is the first in a series of articles aiming to achieve that goal. Before we go in-depth, let us first try to summarize the journey of a machine learning
researcher working on solving an astrophysical problem.
1.1 Stages of a Machine Learning Project
1. The first stage of the machine learning researcher's journey involves setting up his or her computer.
This involves deciding on the hardware requirements, operating system, programming language and
frameworks to use. There are also add-on modules or libraries that are helpful to install, as well as IDEs
and other tools that make your life easier.
2. The second stage is often the most difficult. Here you need to define a scientific problem to solve using
ML or DL. A survey of scientific problems that exist in your field of study is a necessary pre-requisite.
You also need to have a good grasp of the capabilities and limitations of ML or DL techniques to make a
decision here.
3. If you have a well-defined problem, then in the third stage, you need to worry about the data and algorithms.
Making visualization of your data helps a lot towards understanding its nature as well as deciding which
methods to use. Several parameters like the size and availability of your data, the complexity of the
problem and the computational time and expense that are affordable etc. go into deciding the methods to
be used.
4. In the next stage, you need to process your data before you pass it to your ML/DL model. You also need to
design and implement your model. Both these things require writing good code.
5. The next stage consists of the model fitting or the training stage. This is followed by the evaluation of
your model performance.
6. The final stage is where you use all the feedback to go back and make improvements in your model so as
to increase its performance.
The rest of the article deals with the first stage of the journey, which is all about having the proper technical
setup.
1.2 Setting up your computer
You need a computer at two different stages of the journey - one while developing your code and one for running the training of your ML algorithms. You need a much more powerful system for running most modern ML/DL algorithms than what is required for developing the code. Thankfully, there are free and paid options for training that are often better than owning your own hardware. If you do have the budget for a laptop with a dedicated graphics card, make sure you choose one with an NVIDIA GPU. But for developing code you do not need a beefy system. Even a machine with just an Intel i3 processor or its Ryzen equivalent and 4GB of RAM would suffice. If your laptop comes preinstalled with Windows, try to get it to dual boot a Linux OS. Linux is your best friend when it comes to the world of code development. The term Linux doesn't refer to any particular OS that you can install; rather, there are various flavours or distributions of Linux which you can actually install on your machine. Ubuntu is one of the most popular Linux distributions. However, if you are looking for something more interesting than Ubuntu, Arch-based distributions like EndeavourOS are a good option. The Arch User Repository (AUR) is one of the best features of Arch Linux; it makes it easy for beginners to install packages that are often missing from official Linux repositories. Once you are comfortable with using Linux and you feel as if you want to get into the inner workings, one of the best ways to start is by installing the [Arch Linux] OS. You can do this on something like a spare pendrive by following the Arch installation guide.
1.3 The programming setup
Once the OS is set up, deciding on a programming language to use is the next step. You do not have to make much of a decision here. Python is the language that is most suited for machine learning development. It comes pre-installed on most Linux distributions. One of the main advantages of Python is its simplicity and free and open-source nature. This has gained Python a large number of users who have contributed a lot of code. These contributions are available as modules that are easily installed from a central repository. The most commonly used modules for ML developers are the following: Numpy, Pandas, Matplotlib, Scipy, Scikit-Learn, PIL, OpenCV and Tensorflow. The program that helps you install packages from the [Python Packaging Index] is called pip, an acronym for "Pip Installs Packages". Sometimes, if you are required to use a Python version that is not already installed and you do not have sufficient permission to install it, the [Miniconda] distribution becomes a useful tool. Conda environments are slightly different from ordinary Python virtual environments because they are capable of installing different versions of Python as well as non-Python dependencies that are sometimes required for installing Python packages. Conda does this using a container-like environment that keeps the installed dependencies separate from the ones already installed on your system.
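As a quick check that such an environment is working, a minimal script along the lines of the sketch below (the file name sanity_check.py is just an example) imports a few of the modules listed above and exercises them briefly.

```python
# sanity_check.py -- minimal check that the core scientific stack imports and runs.
# Assumes numpy, pandas and matplotlib are installed in the active environment.
import numpy as np
import pandas as pd
import matplotlib

matplotlib.use("Agg")   # headless backend, so this also works on a remote server
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)                 # 100 evenly spaced points
df = pd.DataFrame({"x": x, "sin_x": np.sin(x)})    # wrap them in a DataFrame

plt.plot(df["x"], df["sin_x"])
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.savefig("sanity_check.png")                    # written to the current directory
print(df.describe())                               # basic summary statistics
```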
How do you start writing Python code? There are two ways in which people write machine learning code. The first way uses scripts, which are basically text files with a .py extension. Any good old text editor would suffice, though some editors are especially useful: terminal-based editors like Vim, Nano and Emacs come in handy when you have to run code on remote servers where there is no graphical interface. The second way is using Jupyter Notebooks, which are custom JSON-format files with a .ipynb extension. Jupyter Notebooks, opened in a web browser, enable the user to save both the code and the output of each cell. There also exist IDEs other than Jupyter Notebook that are worth considering; VSCode and Sublime are two such options. Before you start using Python on your Linux system, be sure to check out how to create a Python virtual environment, and always make sure that you are installing Python packages inside a virtual environment.
1.4 Code version control and Github
An important piece of technology that can make your life as a coder much easier in the long run is a version
control system. When you write code, you often find the need to undo your work. If you have saved your work
already, you might find it difficult to revert unless you have saved versions of your code at different times.
Creating such versions manually can quickly become a hectic thing to manage. Git is the most popular version
control system. It allows you to make manual checkpoints (called commits) in your Git repository. A repository
in git is just the primary folder in your system that contains all of the data and code related to your machine
learning project. Keep in mind that Git is designed to track only text files and not image or binary data. Do not
version control your data using Git. There are other options that you can check out if you are interested in version
controlling your models or data (e.g. [Data Version Control]). When using Git, it is advisable to create a git branch in your repository every time you think of making an improvement to your existing code. If the path taken is not to your liking, throw away the branch. If, after a while, you feel the improvements made are there to stay, merge the branch into your already existing best version (called the master or main branch).
The popular code sharing platform [Github] is a collection of git repositories made public by various people
and companies. ML researchers can use Github to share their code for other people to use and review. They can
also share their code privately with other developers or mentors while in the developmental stage using private
repositories. Github provides a cloning method based on Personal Access Tokens (PAT) that makes it straightforward to clone your private repositories on any system. Code repos on Github that make use of Python usually have a requirements.txt file that shows which external packages are required to run the code. Create such a file for your own project. It is also a good practice to create a README file that mentions how the scripts in the repo can be run.
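For illustration, a requirements.txt for a small ML project might look like the sketch below; the package list and version pins are only an example, not a prescription. Running pip install -r requirements.txt inside a fresh virtual environment then recreates the setup.

```
# requirements.txt -- example only; pin the versions your own project actually uses
numpy>=1.24
pandas>=2.0
matplotlib>=3.7
scikit-learn>=1.3
tensorflow>=2.13
```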
1.5 Where to get help?
In this section, I will list resources you can use to learn about the things that have already been discussed, along with some further reading.
1. [Scipy Lectures] - Getting started with Python and learning scientific packages like Numpy, Matplotlib
and Scipy.
2. [Exercism] - A website for learning the basics of Python using exercises.
3. [PyImageSearch] - Adrian’s blog for Computer Vision
4. [Coursera] & [Udacity] - Machine Learning Courses
5. [Google Colaboratory] - Free resource for developing and/or training ML models. Jupyter notebook
running on a virtual machine in the cloud. Note that a single session can run only up to a maximum of 12 hours.
6. [Python Packaging Index] - Repository of python packages
7. [Gdown] - Python package for downloading data from Google Drive using a shareable link
8. Deep Learning with Python, [Chollet(2018)] - Introduction to Deep Learning using Python and the Keras
framework
9. [Arch Linux Wiki] - Solution to most troubles you might face on Linux
10. [Luke Smith Youtube channel] - Getting around to using an Arch Linux based OS
References
[Arch Linux] Arch Linux. https://archlinux.org/.
[Python Packaging Index] Python Packaging Index. https://pypi.org/.
[Miniconda] Miniconda. https://docs.conda.io/en/latest/miniconda.html.
[Data Version Control] Data Version Control. https://dvc.org/.
[Github] Github. https://github.com/.
[Scipy Lectures] Scipy Lectures. https://scipy-lectures.org/.
[Exercism] Exercism. https://exercism.org/.
[PyImageSearch] PyImageSearch. https://pyimagesearch.com/.
[Coursera] Coursera. https://coursera.org/.
[Udacity] Udacity. https://www.udacity.com/.
[Google Colaboratory] Google Colaboratory. https://research.google.com/colaboratory/.
[Gdown] Gdown. https://pypi.org/project/gdown/.
[Chollet(2018)] François Chollet. Deep Learning with Python. Manning Publications Co, Shelter Island, New York, 2018. ISBN 978-1-61729-443-3.
[Arch Linux Wiki] Arch Linux Wiki. https://wiki.archlinux.org/.
[Luke Smith Youtube channel] Luke Smith Youtube channel. https://www.youtube.com/@LukeSmithxyz.
About the Author
Linn Abraham is a researcher in Physics, specializing in A.I. applications to astronomy. He is
currently involved in the development of CNN based Computer Vision tools for classifications of astronomical
sources from PanSTARRS optical images. He has used data from several large astronomical surveys including
SDSS, CRTS, ZTF and PanSTARRS for his research.
The Hertzsprung-Russell Diagram:
Exploring Stellar Evolution and Diversity
by Robin Jacob Roy
airis4D, Vol.1, No.9, 2023
www.airis4d.com
Stars, like the broader Universe, undergo changes over time. They come into existence within areas of concentrated gas, where the accumulation process is activated by external factors like the influence of neighboring supernovae. When numerous stars take shape in close proximity and typically within a similar timeframe, this gathering is termed a star cluster. These clusters can exhibit varying levels of mass and metal content based on the composition of their initial gas clouds. Astronomers employ the Hertzsprung-Russell diagram to map out the developmental phase of a star.
The Hertzsprung-Russell diagram, commonly referred to as the HR diagram, is a fundamental tool in astronomy that provides insights into the characteristics and evolution of stars. Named after its creators, Danish astronomer Ejnar Hertzsprung and American astronomer Henry Norris Russell, this graphical representation is a powerful tool for categorizing stars based on their luminosity, temperature, spectral type, and evolutionary stage. The HR diagram has revolutionized our understanding of stellar properties and their life cycles, and it continues to be a cornerstone of astrophysical research. The HR diagram is shown in Figure 1.
Figure 1: The Hertzsprung-Russell diagram is a graphical representation that plots the temperatures of stars
against their luminosities. The position of a star in the diagram provides information about its present stage and
its mass. Source: ESO
Figure 2: Comparing the Structure and Evolution of the Sun: From its Present State to its Future as a Red
Giant. Source: ESO
2.1 Key Features of the HR Diagram
1. Luminosity (Absolute Brightness): The luminosity of a star represents the total amount of energy it
radiates into space. It is plotted on the vertical axis of the HR diagram and is usually presented on a
logarithmic scale (see the plotting sketch after this list).
2. Temperature (Spectral Type or Color): The temperature of a star affects its color and spectral character-
istics. The temperature scale is typically shown on the horizontal axis, ranging from cooler red stars to
hotter blue stars.
3. Spectral Type: The spectral type of a star is determined by its surface temperature and is usually classified
using the letters O, B, A, F, G, K, and M. These letters correspond to specific temperature ranges, with O
being the hottest and M being the coolest.
4. Evolutionary Stage: The HR diagram allows astronomers to track the evolution of stars from their birth
to their eventual death. Stars follow distinct paths on the diagram as they change over time, transitioning
from the main sequence to various evolutionary stages such as red giants, supergiants, and white dwarfs.
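To make the layout of these axes concrete, here is a minimal matplotlib sketch of an HR-style scatter plot; the handful of (temperature, luminosity) values are made-up placeholders, not real catalogue data.

```python
# hr_sketch.py -- a toy HR-style diagram; the data points below are illustrative only.
import matplotlib.pyplot as plt

# (effective temperature in K, luminosity in solar units) -- placeholder values
temperature = [30000, 10000, 6000, 4000, 3500]
luminosity = [1e5, 80, 1.0, 0.1, 1e3]

plt.scatter(temperature, luminosity)
plt.xlabel("Effective temperature (K)")
plt.ylabel("Luminosity (solar units)")
plt.yscale("log")           # luminosity spans many orders of magnitude
plt.gca().invert_xaxis()    # HR diagrams run from hot (left) to cool (right)
plt.title("Sketch of an HR-style diagram")
plt.savefig("hr_sketch.png")
```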
2.2 Various Objects on the HR Diagram
Main Sequence Stars: Main sequence stars, often referred to as the workhorses of the cosmos, represent the most abundant and long-lasting stage in a star's life cycle. These stars, including our Sun, achieve
equilibrium between gravitational collapse and nuclear fusion in their cores, emitting a steady stream of
energy in the form of light and heat. Their luminosity and surface temperature follow a well-defined
relationship, creating the iconic diagonal band on the Hertzsprung-Russell diagram. Main sequence stars
fuse hydrogen into helium, sustaining their brilliance for billions of years while serving as cosmic beacons
that shape the fundamental properties of galaxies and the evolution of the universe.
Red Giants and Supergiants: As stars exhaust their hydrogen fuel, they expand and cool, moving away
from the main sequence toward the upper right of the HR diagram. Red giants and supergiants are larger
and more luminous than main sequence stars, with evolved cores undergoing helium fusion or other
nuclear reactions. In about 6 billion years, as the Sun enters its red giant phase, the depletion of fuel
within its core will lead to a gradual slowdown of hydrogen fusion. Consequently, the core will undergo a
contraction process, increasing its temperature until it triggers a renewed phase of nuclear fusion. During
this phase, helium will fuse into more complex elements like carbon, nitrogen, and oxygen. Concurrently,
the elevated core temperature will drive hydrogen fusion in the surrounding "shell" of material enveloping
the core. Simultaneously, internal heat generation deep within the star will result in the expansion of its
outer gas layer. Figure 2 depicts a size comparison of our Sun in its current phase and its red giant phase.
Figure 3: Captured by the Hubble Space Telescope, the image portrays Sirius A and Sirius B. In the image, the much brighter Sirius A is prominent, while the fainter Sirius B, a white dwarf, is visible as a subtle point of light positioned towards the lower left of Sirius A. Source: ESA
White Dwarfs: These are the remnants of low- to medium-mass stars that have exhausted their nuclear
fuel and undergone gravitational collapse. These incredibly dense objects pack a mass comparable to that
of the Sun into a volume roughly equivalent to that of Earth. Supported by electron degeneracy pressure,
white dwarfs are stable and gradually cool over billions of years, transitioning from a luminous and hot
state to becoming fainter and cooler. Their evolutionary paths are influenced by their initial masses and
the processes leading to their formation, making them crucial in understanding stellar life cycles and the
ultimate fate of stars. Figure 3 illustrates a size comparison between Sirius B, a white dwarf, and its
companion, Sirius A.
Protostars and Pre-Main Sequence Stars: Objects that are still in the process of gravitational contraction
and heating before they start hydrogen fusion fall in the protostar or pre-main sequence region of the
HR diagram. These objects are often deeply embedded in gas and dust clouds and can exhibit strong
variability.
Blue Supergiants and Wolf-Rayet Stars: These extremely luminous stars are massive and hot, often located
in the upper-left region of the HR diagram. They have high rates of mass loss due to strong stellar winds.
Variable Stars: Stars with irregular or periodic variations in brightness, such as Cepheid variables and
RR Lyrae stars, can be found in specific regions of the HR diagram. These luminosity fluctuations can
stem from various intrinsic or extrinsic factors, such as pulsations, eclipses, or eruptions. By monitoring
the patterns of variation, astronomers can deduce crucial information about the stars' properties, including their distances, sizes, temperatures, and evolutionary stages. These stars are therefore crucial tools for measuring cosmic distances.
In conclusion, the Hertzsprung-Russell diagram serves as a visual roadmap for understanding the diverse
life cycles of stars, their characteristics, and their behaviors. By analyzing the distribution of stars on this
diagram, astronomers can glean valuable insights into the processes that shape the universe and deepen our
understanding of stellar evolution.
About the Author
Robin is a researcher in Physics specializing in the applications of Machine Learning for Astronomy
and Remote Sensing. He is particularly interested in using Computer Vision to address challenges in the fields of
Biodiversity, Protein studies, and Astronomy. He is presently engaged in utilizing machine learning techniques
for the identification of star-forming knots.
A Journey into the Enigmatic Universe of
X-ray Binaries
by Sindhu G
airis4D, Vol.1, No.9, 2023
www.airis4d.com
3.1 X-ray binaries
X-ray binaries (Fig: 1) encompass a category of binary star systems comprising a compact object and a
companion star. The companion star has the possibility of being a main sequence star, a giant star, or, on rare
occasions, a white dwarf. The compact object is a stellar remnant, either a neutron star or a black hole. A binary
system with a white dwarf as the compact object is called a cataclysmic variable (CV). The accretor is the term
used for the compact object in an X-ray binary, while the donor is the term applied to the companion star.
The label "X-ray binaries" originates from their substantial emission of X-ray radiation. Within X-ray
binaries, the accumulation of material onto the compact object can transpire through two principal processes:
Roche lobe overflow and stellar wind accretion. The gravitational attraction pulls matter from the companion
star towards the compact object. As this material descends onto the compact object, it coalesces into an
accretion disk—a dynamic disk of gas and particulates encircling the compact entity. The presence of friction
and gravitational forces within the accretion disk results in a considerable rise in material temperatures, giving
rise to the emission of X-rays.
3.2 Classification
X-ray binaries can be categorized into three main types according to the mass of the star that is losing
mass, namely, the companion star.
3.2.1 Low-mass X-ray binaries (LMXBs)
Low-mass X-ray binaries, often referred to as LMXBs (Fig: 2), belong to a class of X-ray binaries where the companion star is usually a low-mass star, such as a main sequence star or a white dwarf. The compact
object can be either a neutron star or a black hole. Low-mass X-ray binaries lack a strong stellar wind. In these
systems, mass transfer from the companion star to the compact object occurs through Roche lobe overflow.
As the companion star fills its Roche lobe, material surpassing its limits begins to overflow, gravitating
towards the compact object. This ejected material forms an accretion disk encircling the compact object,
creating a dynamic spiral inward due to gravitational forces. LMXBs are characterized by their relatively low X-ray luminosity. Around 200 LMXBs are presently recognized within the Milky Way galaxy, with 13 having been identified within globular clusters.
Figure 1: Artist's conception of an X-ray binary system. Source: NASA/GSFC.
Figure 2: Low Mass X-ray Binary. Source: NASA.
Figure 3: An artist's impression of a High Mass X-ray Binary. Source: NASA/CXC/M.Weiss.
Typically, LMXBs emit the majority of their radiation in the X-ray spectrum, with visible light emissions
significantly fainter, constituting less than one percent of the total radiation output. These binaries typically
exhibit apparent magnitudes ranging from 15 to 20. The orbital periods of LMXBs can exhibit a wide range,
spanning from brief durations of ten minutes to extensive periods lasting hundreds of days.
In LMXBs, the companion star tends to be a late-type star, encompassing spectral types F, G, K, and M.
These stars are cooler and less massive than early-type counterparts like A-type stars. The variability of LMXBs
is often observed through phenomena such as X-ray bursts, known as X-ray bursters, as well as X-ray pulsations.
3.2.2 High-mass X-ray binaries (HMXBs)
High-mass X-ray binaries, or HMXBs (Fig: 3), involve a companion star of considerable mass, typically a
massive star within its main sequence or giant phase. The compact object in HMXBs can either be a neutron
star or a black hole. The massive companion star in HMXBs produces strong stellar winds because of its intense
radiation and high luminosity. These winds carry material away from the companion star. A portion of this
material may be captured by the gravitational force of the compact object, resulting in the formation of an
accretion disk or a direct flow onto the compact object.
Within HMXBs, the massive star—often an O or B type star—typically governs the emission of optical
light, while the compact object serves as the primary source of X-rays. Variability within HMXBs is evident
through phenomena such as X-ray pulsars rather than X-ray bursters. Notably, HMXBs generally display a
greater X-ray luminosity in comparison to LMXBs.
3.2.3 Intermediate-mass X-ray binaries (IXRBs)
As the name suggests, IXRBs have characteristics that fall between those of LMXBs and HMXBs. The
donor star in an IXRB is typically an intermediate-mass star, such as an F- or G-type star, with a mass higher than
that of a low-mass star but lower than that of a massive star found in HMXBs. The compact object in an IXRB
can be a neutron star or a black hole. These systems can show a variety of X-ray behaviors and provide valuable
insights into the accretion processes and interactions between stars and compact objects.
About the Author
Sindhu G is a research scholar in Physics doing research in Astronomy & Astrophysics. Her research
mainly focuses on the classification of variable stars using different machine learning algorithms. She also works
on the period prediction of different types of variable stars, especially eclipsing binaries, and on the study of optical
counterparts of X-ray binaries.
Part III
Biosciences
Microflora of the intestine
by Geetha Paul
airis4D, Vol.1, No.9, 2023
www.airis4d.com
1.1 Introduction
The intestinal microflora, also known as the gut microbiota, is a complex ecosystem containing between 400 and
1000 bacterial species inhabiting the human digestive tract. The upper gastrointestinal tract, including
the stomach, duodenum, jejunum, and upper ileum, typically possesses a sparse microflora, with anaerobes
(microorganisms that live in the absence of oxygen) outnumbering facultative anaerobes. The bacterial concentration
is less than 10⁴ organisms/ml of intestinal secretions. The flora is sparse in the stomach and upper intestine but
luxuriant in the lower bowel. Bacteria occur both in the lumen and attached to the mucosa but do not usually
penetrate the bowel wall. This intricate ecosystem encompasses various microorganisms coexisting within the
intestinal environment, such as bacteria, archaea, fungi, and viruses. The microflora is fundamental in essential
physiological processes and significantly contributes to overall host health and functioning. The delicate balance
and composition of the intestinal microflora hold profound implications for vital functions, including digestion,
nutrient absorption, immune system modulation, and protection against harmful pathogens.
Within this community, microorganisms interact among themselves and with the host, forming complex
relationships that collectively maintain the stability and proper functioning of the gut environment. The
relationship between the gut microbiota and human health is being increasingly recognised. It is now well
established that a healthy gut flora is mainly responsible for the overall health of the host. Diverse factors,
including diet, age, genetics, and environmental exposures, influence the composition of the intestinal microflora.
Variations in this microflora can impact individual health outcomes and susceptibility to various diseases. A
disruption in the balance of the microflora, known as dysbiosis, has been associated with various health
conditions, encompassing inflammatory bowel diseases, metabolic disorders, and mental health issues. Ongoing
research into the intestinal microflora has advanced our understanding of its intricate roles in preserving
homeostasis and has led to potential therapeutic interventions. Advances in sequencing technologies have
enabled the identification and characterisation of specific microbial species and their functional roles within the
gut ecosystem. This knowledge has paved the way for developments in personalised medicine and strategies
aimed at modulating the gut microflora to enhance overall health and well-being. In summation, the intestinal
microflora is a dynamic and pivotal component of the gastrointestinal system, exerting substantial influence on
diverse aspects of host health and functioning. Its multifaceted interactions and roles remain subjects of active
research, holding promising avenues for expanding our comprehension of human health and disease.
1.2 Composition of the normal gut biota
(image courtesy: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528021/)
Figure 1: Distribution of the average human gut flora.
The gut microbiota is highly diverse, with recent research revealing a more extensive array of microbial
species and genetic potential than previously thought. Initially, the gut microbiota was believed to consist of
around 500-1000 species, but a recent study estimated over 35,000 bacterial species. The Human Microbiome
Project and MetaHIT studies indicate that there could be over 10 million non-redundant genes in the human
microbiome. Danish research introduced the concept of High Gene Count (HGC) and Low Gene Count
(LGC) microbiomes, impacting health and disease. The HGC microbiome has a robust composition, including
beneficial bacteria like Akkermansia, associated with digestive health and lower metabolic disorders. LGC
individuals have a higher proportion of pro-inflammatory bacteria linked to conditions like inflammatory
bowel disease. The healthy gut microbiota consists mainly of Firmicutes, Bacteroidetes, Actinobacteria and
Verrucomicrobia. The composition varies along the gastrointestinal tract and from lumen to mucosal surface.
Longitudinal and axial differences in microbial presence contribute to the complexity of the gut microbiota.
This information underscores the gut microbiota's intricate diversity and functional importance in health and
disease.
Figure 2 (below) shows the differential pH and the types of bacteria that occur in different parts of the
intestine.
1.3 Factors that contribute to the composition of the typical gut microbiota
Birth Delivery Mode: The delivery method, whether vaginal or caesarean, can impact the initial gut
microbiota colonisation.
Infancy and Adult Diet: The diet followed during infancy, including breast milk or formula feeding, and
dietary choices in adulthood, such as vegan or meat-based diets, can influence the gut microbiota makeup.
(image courtesy: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528021/)
Figure 2: Concentration of the bacterial flora in regions of the gastrointestinal tract.
Antibiotic Usage and Environmental Compounds: The use of antibiotics or antibiotic-like compounds,
whether from the environment or from the gut's commensal community, is noteworthy because it raises
concerns about potential long-term disturbances to the healthy gut microbiota and the potential horizontal
transmission of resistance genes. This could create a reservoir of microorganisms housing a diverse pool of
multidrug-resistant genes.
It is crucial to acknowledge that antibiotics might disrupt the equilibrium of the gut microbiota, with
possible long-term consequences. Additionally, there is a risk of the horizontal transmission of antibiotic-
resistance genes among microorganisms, which could further contribute to developing multidrug-resistant
strains.
1.4 Current methods to study gut microbiota
Contemporary methods for investigating the gut microbiota involve advanced techniques that have revo-
lutionised our understanding of this complex microbial ecosystem. Traditional culture-based methods, which
faced limitations in isolating anaerobic microorganisms, have been supplemented and even replaced by more
comprehensive and efficient methodologies:
1.4.1 DNA Isolation
Stool samples are collected from individuals, and DNA is extracted from the stool. The DNA contains
genetic information of the microbial community residing in the gut. High-throughput sequencing of a specific
microbial gene, usually the 16S rRNA gene, is performed. This gene is highly conserved among bacteria but
contains variable regions that allow distinguishing between different species or genera. This method provides a
snapshot of the microbial diversity in the sample.
1.4.2 Bacterial Gene Sequencing
Sequencing of bacterial genes involves metagenomic analysis of the DNA that codes for the 16S rRNA. The
16S rRNA gene is small (about 1.5 kb) and highly conserved, with nine hypervariable regions
that are sufficient to differentiate various bacterial species.
Sequencing:
The extracted genetic material is subjected to high-throughput sequencing techniques like next-generation
sequencing (NGS). This results in a vast amount of sequencing data that needs analysis.
Data Quality Control and Preprocessing:
Raw sequencing data undergoes quality control steps to remove noise and errors. This includes trimming
low-quality reads and filtering out artefacts.
Taxonomic and Functional Profiling:
Various tools are used to analyse the sequencing data. Taxonomic profiling identifies the composition of
microbial species in the sample, often using tools like MetaPhlAn or MEGAN. Functional profiling assesses
the potential functional capabilities of the microbial community using tools like MG-RAST, KEGG, COG, or
TIGRFAM.
Host-Microbe Interaction Analysis:
Understanding the interaction between host and microorganisms is crucial. Tools like PICRUSt, FANTOM,
HUMAnN, and CAZy can help predict the functional roles of the microbiota and identify any influence on host
metabolism.
Statistical Analysis:
Statistical methods are applied to analyse the data, identify significant differences between groups, and
explore correlations between microbial species and host health conditions. R and Python libraries are often
used for this purpose (a minimal example is sketched after these steps).
Functional Annotation:
Functional annotation involves assigning specific functions to the identified genes or gene products. This
helps understand the potential roles of microbial communities in various biological processes.
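As a minimal sketch of the statistical step mentioned above (the group labels and relative-abundance values below are entirely hypothetical and used only for illustration), one might compare the abundance of a single genus between two groups of samples with a non-parametric test from SciPy:

from scipy.stats import mannwhitneyu

# Hypothetical relative abundances (fractions) of one genus in two sample groups.
healthy = [0.04, 0.06, 0.05, 0.07, 0.03, 0.05]
patients = [0.01, 0.02, 0.015, 0.03, 0.01, 0.02]

statistic, p_value = mannwhitneyu(healthy, patients, alternative="two-sided")
print(f"U statistic: {statistic}, p-value: {p_value:.4f}")
# A small p-value suggests the genus differs in abundance between the groups;
# real studies additionally correct for multiple testing across many taxa.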
1.5 Bioinformatics Analysis
Following sequencing, bioinformatics tools are employed to analyse the data. These tools help identify
the different microbial species or groups in the sample based on the sequences obtained. By comparing the
sequences to existing databases, researchers can gain insights into the composition and relative abundance of
the gut microbiota.
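As a simple illustration of how composition and relative abundance are summarised once reads have been assigned to taxa (the taxa, sample names, and counts below are invented for the example), a table of per-sample read counts can be converted into relative abundances with pandas:

import pandas as pd

# Hypothetical read counts assigned to a few phyla in three samples.
counts = pd.DataFrame(
    {
        "Sample1": [5200, 3100, 900, 400],
        "Sample2": [4100, 4500, 600, 300],
        "Sample3": [2500, 5200, 1500, 800],
    },
    index=["Firmicutes", "Bacteroidetes", "Actinobacteria", "Verrucomicrobia"],
)

# Divide each column (sample) by its total read count to obtain fractions.
relative_abundance = counts.div(counts.sum(axis=0), axis=1)
print(relative_abundance.round(3))  # each column now sums to 1.0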
Figure 3 outlines the various steps involved in the bioinformatics analysis, starting from the collection of
samples, through extraction and sequencing, to statistical analysis. The interaction between host and microbes, along
with the functional capacity of the microbiota, can be studied. MG-RAST: Metagenomics rapid annotation
using subsystem technology; CAZy: Carbohydrate-active enzymes; MetaPhlAn: Metagenomic phylogenetic
analysis; KEGG: Kyoto encyclopaedia of genes and genomes; COG: Clusters of orthologous groups;
PICRUSt: Phylogenetic investigation of communities by reconstruction of unobserved states; MEGAN: Metagenome
analyser; MEDUSA: Metagenomic data utilisation and analysis; FANTOM: Functional annotation
and taxonomic analysis of metagenomes; HUMAnN: Human microbiome project unified metabolic analysis
network; BLAST: Basic local alignment search tool; TIGRFAM: Protein sequence classification; PFAM: Protein
families; SOAP: Short oligonucleotide analysis package; QIIME: Quantitative insights into microbial ecology.
(image courtesy: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528021/)
Figure 3: Bioinformatics workflow.
1.6 Metabolomics
Metabolomics is an expanding field that examines the small molecules produced by both the host and the
gut microbiota due to their interactions. These molecules play a role in various metabolic pathways and can
affect health and disease. By studying the metabolome alongside the gut microbiota composition, researchers
can better understand how these microbial communities influence overall health.
1.7 Integrated Data Analysis
Combining data from microbial DNA sequencing and metabolomics provides a comprehensive overview of
the interplay between the gut microbiota and host metabolism. This integrated approach offers a more accurate
assessment of the relationship between the gut microbiota and various health and disease states.
These advanced methods have several advantages over traditional culture-based techniques. They allow for
studying a broader range of microorganisms, including the previously challenging-to-culture anaerobic species.
Moreover, these methods are less time-consuming and provide a more detailed and accurate representation of
the complex microbial communities in the gut. As technology continues to evolve, our ability to unravel the
intricate connections between the gut microbiota, host health, and disease will continue to improve.
The workflow encompasses the entire process of turning raw biological samples into valuable insights
about the microbial communities and their interactions with the host.
References
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528021/
https://www.sciencedirect.com/science/article/abs/pii/S1877117322000904
https://www.sciencedirect.com/science/article/abs/pii/B978012804024900029X
https://www.ncbi.nlm.nih.gov/books/NBK7670/
Peterson DA, Frank DN, Pace NR, Gordon JI. Metagenomic approaches for defining the pathogenesis
of inflammatory bowel diseases. Cell Host Microbe. 2008;3:417–427. (https://pubmed.ncbi.nlm.nih.gov/18541218)
Hamady M, Walker JJ, Harris JK, Gold NJ, Knight R. Error-correcting barcoded primers for pyrosequencing
hundreds of samples in multiplexes. Nat Methods. 2008;5:235–237.
About the Author
Geetha Paul is one of the directors of airis4D. She leads the Biosciences Division. Her research
interests extend from Cell & Molecular Biology to Environmental Sciences, Odonatology, and Aquatic Biology.
Part IV
Computer Programming
Understanding Convolutional Neural
Networks
by Ninan Sajeeth Philip
airis4D, Vol.1, No.9, 2023
www.airis4d.com
Though it evolved alongside Artificial Neural Networks (ANNs), which use extracted features and the back-
propagation algorithm to train a model that maps the input feature space to the output target space,
the Convolutional Neural Network (CNN) received wider attention only later, when computational resources became
more easily accessible. CNNs are best suited to handling grid-like data, such as images. This is one
area in which ANNs struggled to perform even marginally well, owing to the difficulty of extracting reliable features
for classification.
Let us consider face recognition as an example. The distance between the eyes and the ratios of various line
segments joining different facial features can all be considered. But as soon as the camera's orientation changes,
all these values change or sometimes become inaccessible. What is required would be a comprehensive model
that considers the correlation between every region of the face and all possible orientations. This is essentially
what CNNs do.
CNNs were first introduced in the early 1980s, but they only became widely used in the 2010s when
advances in machine learning and computing power made it possible to train CNNs on large datasets of images.
CNNs have since been used to achieve state-of-the-art results on various image recognition tasks. In this article, we
will try to understand how CNNs generate features automatically from images or, for that matter, from any array of
numbers.
(image courtesy: https://edri.org/our-work/facial-recognition-and-fundamental-rights-101/)
Figure 1: Extraction of features from an image - traditional way
(image courtesy: DOI:10.3390/app9102028)
Figure 2: A simple CNN Architecture
Though the concept of Convolution was widely known, the first practical and popular application was
the LeNet-5 developed by Yann LeCun in 1998. Its success in the recognition of handwritten digits brought
it widespread recognition. However, due to the limited computational power available, ANNs remained more
popular in machine learning until around 2010, when GPUs and massively parallel computers revolutionised
the entire field. Several computational competitions encouraged researchers to enhance the features of existing
CNNs. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) held in 2012 witnessed the
groundbreaking success of AlexNet, developed by Alex Krizhevsky.
The CNN consists of several layers, each of which learns features from images hierarchically, making
recognising complex structures much more efficient than manually extracted features used as input to ANNs.
This hierarchical decomposition also allows CNNs to learn the spatial relationship between pixels in an image.
Though image recognition was the central application area handled by CNNs, their impressive
ability to decompose data hierarchically means that one-dimensional CNNs for sequential data are also common.
The CNN Architecture
Let us consider image recognition as an example to understand CNNs. The image can be a grayscale image
or a colour image. The greyscale image is a 2D array of numbers where each number indicates the pixel’s
brightness at that location. So, if the image has 32 pixels along the x-axis and another 32 along the y-axis, it
forms a 2D array of 32×32 pixels. For colour images, we have a separate array for each of the Red, Green
and Blue colours that form the image. Hence, the image combines three 2D arrays, one for each colour, and the
dimension is 32×32×3. This is the size of the input layer that accepts the input image.
The input layer for a 32×32 grey-scale image can be implemented in Python (Keras) with the command
Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 1)). Here, the input_shape (32, 32, 1) means that the
image has only one channel and is composed of 32 by 32 pixels. So, the only difference in the code for a
colour image is that instead of (32, 32, 1), the input_shape becomes (32, 32, 3).
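As a minimal sketch (assuming TensorFlow/Keras is installed, and using dummy zero-valued images in place of real data), the snippet below shows that the same kind of Conv2D layer accepts either a grayscale or a colour image; only the channel dimension of the input changes, and a 3×3 kernel with no padding shrinks the 32×32 input to 30×30.

import numpy as np
from tensorflow.keras.layers import Conv2D

gray_image = np.zeros((1, 32, 32, 1), dtype="float32")    # a batch of one 32x32 grayscale image
colour_image = np.zeros((1, 32, 32, 3), dtype="float32")  # a batch of one 32x32 RGB image

# Separate layer instances, since a layer's weights are built for a fixed
# number of input channels the first time it is called.
gray_features = Conv2D(32, (3, 3), activation='relu')(gray_image)
colour_features = Conv2D(32, (3, 3), activation='relu')(colour_image)

print(gray_features.shape, colour_features.shape)  # (1, 30, 30, 32) for both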
As the name suggests, Convolutional Neural Networks use convolution to extract features from an image.
The convolution is done using a smaller matrix called the kernel. A typical kernel has a dimension of
(3, 3), convolving the central pixel together with those around it. For example, assume that the kernel has the value 1
along the first column, 0 along the second and -1 along the third. An illustration of the convolution operation is
shown in Figure 3.
Though the illustration is with a simple kernel and a simple image, note that in the actual CNN, each layer
will consist of many kernels designed to extract all valuable features from an image. Since deciding the useful
kernels a priori is impossible, we leave that task for the machine to handle. This is what we mean by training.
It may be seen that the size of the feature map is different from that of the input image. This change in size
is determined by the kernel dimensions, padding, and stride, and is given by the formula
Output size = (Input size − Kernel size + 2 × Padding) / Stride + 1.
(a) The (3x3) kernel
(b) An input image of size (5,5) with some arbitrary
values
(c) The convolution operation generates the feature map
as output
(d) The operation is done by sliding the kernel over the
image by a predefined step size (here 1) and updating
the Feature Map.
Table 1.1: Source: https://medium.com/@rathna211994/convolution-neural-networks-cnn-b6fe90214b1e
Here, Padding and Stride are two new terms. The padding adds extra pixels (often zeros) around the input
data before applying convolution. It helps maintain the spatial dimensions of the feature map and can influence
whether the feature map is smaller or the same size as the input. The stride indicates how many pixels the kernel
moves in each step. A stride of 1 means the kernel shifts by 1 pixel, while a larger stride skips more pixels.
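To make the formula and the sliding-window operation concrete, here is a minimal NumPy sketch (the 5×5 pixel values are arbitrary and chosen only for illustration) that applies the 3×3 kernel described above with stride 1 and no padding, and checks the output size:

import numpy as np

# An arbitrary 5x5 image and the 3x3 kernel from the text
# (1s in the first column, 0s in the second, -1s in the third).
image = np.array([[3, 0, 1, 2, 7],
                  [1, 5, 8, 9, 3],
                  [2, 7, 2, 5, 1],
                  [0, 1, 3, 1, 7],
                  [4, 2, 1, 6, 2]], dtype=float)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

stride, padding = 1, 0
out_size = (image.shape[0] - kernel.shape[0] + 2 * padding) // stride + 1
print(out_size)  # (5 - 3 + 0)/1 + 1 = 3

feature_map = np.zeros((out_size, out_size))
# Slide the kernel over the image and take the element-wise product sum at each position.
for i in range(out_size):
    for j in range(out_size):
        window = image[i * stride:i * stride + 3, j * stride:j * stride + 3]
        feature_map[i, j] = np.sum(window * kernel)

print(feature_map)  # the 3x3 feature map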
The machine initialises each kernel with random values during the first training round. In subsequent
training rounds (epochs), the kernel values learned so far are carried forward. The image is then
presented through the input layer (Figure 2), and the feature map is generated. To introduce nonlinearity, an
activation function is applied to every element of the map. A typical activation function is the ReLU (Rectified Linear
Unit), which replaces every negative value with zero and leaves the positive values unchanged.
The generated map is then subjected to a Pooling Layer, a downsampling procedure that reduces the spatial
dimension of the data while retaining essential features. This is important to reduce the computational load
and memory usage and improve the network’s generalisation ability. There are mainly two types of Pooling:
Max Pooling and Average Pooling. We define a window size, much like that of the kernel, and then, in the case of max
pooling, replace the corresponding pixel in the resulting feature map with the maximum value in the window, or with
the average value in the case of Average Pooling. This can be implemented in Python as MaxPooling2D((n, n)),
where n represents the window size. Typically, n is set to 2.
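As a small illustrative sketch (with arbitrary values), 2×2 max pooling on a 4×4 feature map can be carried out directly in NumPy; average pooling simply replaces the maximum with the mean:

import numpy as np

feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 0],
                        [7, 2, 9, 4],
                        [1, 0, 3, 8]], dtype=float)

n = 2  # pooling window size (and step size)
# Group the 4x4 map into non-overlapping 2x2 blocks and take each block's maximum.
pooled = feature_map.reshape(2, n, 2, n).max(axis=(1, 3))
print(pooled)
# [[6. 5.]
#  [7. 9.]]
# Average pooling would use .mean(axis=(1, 3)) instead of .max(axis=(1, 3)).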
The output of the Pooling layer goes to the subsequent Convolution and pooling layers. The complexity of
the data determines the number of subsequent such layers. A typical value is two. Since the layers are arranged
sequentially, this model is often called Sequential.
The output of the final convolution layer is converted into a one-dimensional sequence of values by a
process called Flatten. This is to unwrap the 2D array by attaching each row to the end of the previous one. This
is required to build the final layers of the model using a Fully Connected Feed-Forward Network that can handle
only one-dimensional data. In structure, it is very similar to an ANN layer, optionally followed by a Dropout layer
that randomly switches off a fraction of the connections during training to reduce overfitting. The network's last layer
usually uses a softmax activation to convert the network's raw outputs into a probability for each possible class, from
which the most probable class of a given input image is chosen.
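To see what the softmax activation does, here is a small NumPy sketch with three invented output scores; softmax converts them into probabilities that sum to one, and the largest probability identifies the predicted class:

import numpy as np

scores = np.array([2.0, 1.0, 0.1])          # invented raw outputs for three classes
exp_scores = np.exp(scores - scores.max())  # subtracting the maximum improves numerical stability
probs = exp_scores / exp_scores.sum()

print(probs.round(3))         # [0.659 0.242 0.099]
print(probs.sum())            # approximately 1.0
print(int(np.argmax(probs)))  # 0, the index of the most probable class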
In Python, the entire CNN model can be represented conveniently using Keras as:
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])
We are now ready to get our hands dirty with a working CNN model on the popular CIFAR-10 dataset, which
consists of 60,000 32×32 colour images in 10 different classes. Each class contains 6,000 images. The dataset
is split into 50,000 training images and 10,000 testing images.
To run the code below, you need to install TensorFlow along with Python. You can use pip to install it with
the command "pip install tensorflow". Now copy and paste the following code into a file and save it as
simple_CNN.py. You can run it with the command "python simple_CNN.py". When you execute it, make sure
that you have a stable internet connection because the script downloads the CIFAR-10 dataset.
The preprocessing step in the code rescales the pixel values to fall between 0 and 1. Image pixels usually take
values between 0 and 255, which is why they are divided by 255. For images with a different
value range, appropriate modifications should be made. It also converts the target variables to categorical (one-hot)
form, as required by the categorical cross-entropy loss used below.
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical

# Load and preprocess the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0          # rescale pixels to [0, 1]
y_train, y_test = to_categorical(y_train, num_classes=10), to_categorical(y_test, num_classes=10)

# Create the CNN model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, batch_size=64, validation_data=(x_test, y_test))

# Evaluate the model on the test data
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc}")

# Make predictions on new data
predictions = model.predict(x_test[:10])
predicted_classes = [tf.argmax(prediction).numpy() for prediction in predictions]
print("Predicted classes:", predicted_classes)
About the Author
Professor Ninan Sajeeth Philip is a Visiting Professor at the Inter-University Centre for Astronomy
and Astrophysics (IUCAA), Pune. He is also an Adjunct Professor of AI in Applied Medical Sciences [BCMCH,
Thiruvalla] and a Senior Advisor for the Pune Knowledge Cluster (PKC). He is the Dean and Director of airis4D
and has 33+ years of teaching experience in Physics. His area of specialisation is AI and ML.
About airis4D
Artificial Intelligence Research and Intelligent Systems (airis4D) is an AI and Bio-sciences Research Centre.
The Centre aims to create new knowledge in the field of Space Science, Astronomy, Robotics, Agri Science,
Industry, and Biodiversity to bring Progress and Plenitude to the People and the Planet.
Vision
Humanity is in the 4th Industrial Revolution era, which operates on a cyber-physical production system. Cutting-
edge research and development in science and technology, creating new knowledge and skills, has become the key to
the new world economy. Most of the resources for this goal can be harnessed by integrating biological systems
with intelligent computing systems offered by AI. The future survival of humans, animals, and the ecosystem
depends on how efficiently the realities and resources are responsibly used for abundance and wellness. Artificial
intelligence Research and Intelligent Systems pursue this vision and look for the best actions that ensure an
abundant environment and ecosystem for the planet and the people.
Mission Statement
The 4D in airis4D represents the mission to Dream, Design, Develop, and Deploy Knowledge with the fire of
commitment and dedication towards humanity and the ecosystem.
Dream
To promote the unlimited human potential to dream the impossible.
Design
To nurture the human capacity to articulate a dream and logically realise it.
Develop
To assist talents in materialising a design into a product, a service, or knowledge that benefits the community
and the planet.
Deploy
To realise and educate humanity that a knowledge that is not deployed makes no difference by its absence.
Campus
Situated on a lush green village campus in Thelliyoor, Kerala, India, airis4D was established under the auspices
of SEED Foundation (Susthiratha, Environment, Education Development Foundation), a not-for-profit company
for promoting Education, Research, Engineering, Biology, Development, etc.
The whole campus is powered by solar energy and has a rainwater harvesting facility that provides a sufficient water
supply for up to three months of drought. The computing facility on the campus is accessible from anywhere, 24×7,
through dedicated optical fibre internet connectivity.
There is a freshwater stream that originates from the nearby hills and flows through the middle of the campus.
The campus is a noted habitat for the biodiversity of tropical fauna and flora. airis4D carries out periodic and
systematic water quality and species diversity surveys in the region to ensure its richness. It is our pride that
the site has consistently been environment-friendly and rich in biodiversity. airis4D also grows fruit plants
that can feed birds and maintains water bodies to help them survive the drought.