Cover page
Species Name: Crocothemis servilia.
Crocothemis servilia is commonly known as the scarlet skimmer and belongs to the Order Odonata, Family Libellulidae. In this picture, Crocothemis servilia is in the obelisk posture. The "obelisk posture" refers to a handstand-like position adopted by certain species of dragonflies and damselflies in order to regulate their body
temperature and prevent overheating on sunny days. This unique behavior involves the elevation of the abdomen
until its apex is oriented toward the sun, thereby reducing the amount of body surface exposed to direct solar
radiation. This alignment becomes most pronounced when the sun is positioned almost directly overhead,
creating a visual resemblance to an obelisk-like structure.
Managing Editor: Ninan Sajeeth Philip
Chief Editor: Abraham Mulamootil
Editorial Board: K Babu Joseph, Ajit K Kembhavi, Geetha Paul, Arun Kumar Aniyan
Correspondence: The Chief Editor, airis4D, Thelliyoor - 689544, India
Journal Publisher Details
Publisher : airis4D, Thelliyoor 689544, India
Website : www.airis4d.com
Email : nsp@airis4d.com
Phone : +919497552476
Editorial
by Fr Dr Abraham Mulamoottil
airis4D, Vol.1, No.9, 2023
www.airis4d.com
The 9th edition of the airis4D Journal must not overlook India's recent strides in its space program. The voyage to the moon began with Chandrayaan-1's launch in 2008, a significant milestone in India's space journey. This mission proved pivotal as India's first successful endeavour beyond Earth. Notably, Chandrayaan-1's breakthrough discovery of water molecules on the moon's surface transformed scientists' perception of our closest cosmic neighbour. The mission's orbiter played a key role, opening doors for forthcoming missions to investigate lunar resources and delve into the moon's composition. Taking inspiration from Chandrayaan-1's success, India embarked on the ambitious Chandrayaan-2 mission in 2019. This mission aimed to expand lunar knowledge by endeavouring to land a rover softly on the moon's surface. Although the lander experienced
a challenging touchdown, the orbiter continued its lunar orbit, beaming back valuable data. Chandrayaan-2
underlined India's prowess in executing intricate space manoeuvres and brought it closer to achieving a delicate lunar landing. A groundbreaking achievement arrived in August 2023 with India's most recent lunar endeavour, Chandrayaan-3. It accomplished a triumphant landing near the lunar south pole, firmly establishing India's status as a prominent figure in the global space arena. Beyond the successful touchdown, Chandrayaan-3 showcased India's unwavering determination and cutting-edge space technology, solidifying its position as a trailblazer. The mission's success underscored India's commitment to exploring uncharted lunar territories. These lunar expeditions encompass more than scientific quests; they encapsulate India's aspirations
to transcend conventional boundaries in space exploration. They have yielded invaluable insights into lunar
geology and composition, laying the groundwork for future lunar endeavours and potentially even setting the
stage for human exploration. As India consistently invests in space technology, its lunar endeavours inspire its
citizens and the global community. These missions are not just contributions to scientific knowledge but also
sources of pride and curiosity that transcend geographical borders. The Indian Space Research Organisation
(ISRO) has notched remarkable milestones in space exploration, spotlighting India's prowess in the field. However, persistent Western scepticism and condescension regarding India's simultaneous investment in space
and efforts to tackle domestic challenges like poverty remain a recurring narrative. Critics question resource
allocation, suggesting funds could be better applied elsewhere. India's retort to these concerns underscores its position as the world's fifth-largest economy and emphasises its accomplishments, pointing out that economic growth and technological advancement are paralleled by initiatives addressing societal development. India's accomplishments in space challenge colonial narratives of inferiority and echo its ascent as a global force. These successes resonate with historical contexts, countering economic exploitation during British colonisation and spotlighting India's determination to reshape its narrative on a global stage. India's successful lunar landings effectively dispel the claims of fabrication that have plagued other space programs, and ISRO's transparent sharing of scientific data validates the authenticity of India's achievements and its dedication to credible space exploration. This ongoing debate shows how important it is for a nation to shape its own narrative, sidestepping external manipulation and misrepresentation: controlling the narrative is pivotal to presenting a balanced and precise portrayal of achievements and challenges, free of external biases and misconceptions. In conclusion, India's journey is
multifaceted, navigating intricate paths toward progress. A common misconception often arises: Can a country
effectively pursue its development agenda while addressing poverty? The answer, as India has demonstrated,
is a resounding yes. India's development agenda and poverty alleviation projects are not opposing forces; they are two sides of the same coin, and the two facets synergise to uplift the nation. This edition of airis4D first explores the "Difference Boosted Neural Network" (DBNN) architecture and its extension, the "Enhanced Difference Boosted Neural Network" (E-DBNN). The approach
enhances performance compared to traditional methods like Naive Bayes. The author, Blesson George, shares
E-DBNN’s Python code on GitHub and focuses on machine learning algorithms for protein studies. The second
article discusses text summarisation techniques in "From Information Overload to Clarity: The Power of Text Summarization (Part 2)" by Jinsu Ann Mathew. It covers extractive and abstractive summarisation methods.
Extractive summarisation selects essential sentences from the source text to form a summary, while abstractive
summarisation generates new sentences that capture the essence. Abstractive methods include structure-based,
maintaining original structure, and semantic-based, creating new sentences with similar meanings. The article
equips readers to understand different summarisation strategies and their applications. The third article, "Guide to Practical Machine Learning for Astronomy - Part I" by Linn Abraham, provides a practical guide for those
in a scientific background interested in entering machine learning. It outlines the stages of a machine learning
project, covering technical setup, programming languages, and code version control. The article also introduces
resources for learning Python, machine learning courses, and using GitHub. It emphasises the importance of
version control using Git and highlights helpful resources for learning and development. The fourth article,
"The Hertzsprung-Russell Diagram: Exploring Stellar Evolution and Diversity" by Robin Jacob Roy, introduces
the Hertzsprung-Russell (HR) diagram, a fundamental tool in astronomy. The diagram categorises stars based
on luminosity, temperature, spectral type, and evolutionary stage, revealing insights into their properties and
life cycles. It discusses the main features of the HR diagram, such as luminosity, temperature, spectral type, and
evolutionary stage. It explains its use in understanding different types of stars, including main sequence stars,
red giants, supergiants, white dwarfs, and more. The HR diagram is a crucial tool for comprehending stellar
evolution and diversity. The fifth article, "X-ray Binaries" by Sindhu G, discusses X-ray binaries, a category
of binary star system containing a compact object (neutron star or black hole) and a companion star. These
binaries emit X-ray radiation due to material accumulation onto the compact object, often through processes like
Roche lobe overflow. They can be classified into low-mass X-ray binaries (LMXBs), high-mass X-ray binaries
(HMXBs), and intermediate-mass X-ray binaries (IXRBs) based on the mass of the companion star. LMXBs
involve low-mass stars transferring material through Roche lobe overflow, HMXBs consist of massive stars with
strong stellar winds, and IXRBs feature intermediate-mass stars as donors. These binaries provide insights into
accretion processes and interactions between stars and compact objects, contributing to our understanding of the
universe. The sixth article, "Microflora of the Intestine," by Geetha Paul, discusses the complex gut microbiota
ecosystem, which consists of over 400 to 1000 bacterial species in the human digestive tract. It highlights how
this ecosystem impacts digestion, nutrient absorption, immune modulation, and protection against pathogens.
Factors like diet, genetics, and environment influence the microbiota composition. Dysbiosis, an imbalance in
the microbiota, is linked to various health conditions. The article also covers advanced techniques, such as DNA
sequencing and metabolomics, used to study and understand the gut microbiota's roles in maintaining health and
causing diseases. The last article, "Understanding Convolutional Neural Networks" by Ninan Sajeeth Philip,
explains the significance and functioning of Convolutional Neural Networks (CNNs). CNNs are a class of neural
networks well-suited for grid-like data, particularly images. The article delves into how CNNs automatically
generate features from images and discusses their architecture, including convolutional and pooling layers. It
also covers key concepts such as activation functions, strides, padding, and flattening. The article provides a
code example using TensorFlow and Keras to create a simple CNN model for image classification using the
CIFAR-10 dataset.
Contents
Editorial ii
I Artificial Intelligence and Machine Learning 1
1 Difference Boosted Neural Network(DBNN) - Part 4 2
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Multiple feature connections as likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Code Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 From Information Overload to Clarity: The Power of Text Summarization (Part 2) 5
2.1 Construction of an intermediate representation of the input text . . . . . . . . . . . . . . . . . 6
2.2 Scoring the sentences based on the representation . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Selection of a summary comprising several sentences . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Structure-Based Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.5 Semantic-Based Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
II Astronomy and Astrophysics 10
1 Guide to Practical Machine Learning for Astronomy - Part I 11
1.1 Stages of a Machine Learning Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2 Setting up your computer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3 The programming setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4 Code version control and Github . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Where to get help? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2 The Hertzsprung-Russell Diagram: Exploring Stellar Evolution and Diversity 15
2.1 Key Features of the HR Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Various Objects on the HR Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3 A Journey into the Enigmatic Universe of X-ray Binaries 19
3.1 X-ray binaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
III Biosciences 23
1 Microflora of the intestine 24
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.2 Composition of the normal gut biota . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.3 Factors that contribute to the composition of the typical gut microbiota . . . . . . . . . . . . 25
1.4 Current methods to study gut microbiota . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.5 Bioinformatics Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.6 Metabolomics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.7 Integrated Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
IV Computer Programming 30
1 Understanding Convolutional Neural Networks 31
Part I
Artificial Intelligence and Machine Learning
Difference Boosted Neural Network(DBNN) -
Part 4
by Blesson George
airis4D, Vol.1, No.9, 2023
www.airis4d.com
1.1 Introduction
In the preceding discussions, we have thoroughly explored a range of distinct aspects and applications
associated with the DBNN (Difference Boosted Neural Network) architecture. However, the scope of this
current article pivots towards an intriguing extension of this network paradigm, specifically denoted as the
"Enhanced Difference Boosted Neural Network", or E-DBNN for brevity.
We are placing significant emphasis on two fundamental principles that underlie the DBNN framework.
These principles are the imposition of conditional independence and the utilization of the difference boosting
technique. Notably, the core necessity of a Naive Bayes classifier, which hinges on the concept of conditional
independence, is effectively satisfied within the DBNN framework through the implementation of imposed
conditional independence mechanisms.
Within the framework of DBNN, the fulfillment of conditional independence is achieved by utilizing class-
based joint probabilities as derived features. On the other hand, in the context of E-DBNN, the derived feature
is constructed by considering the joint occurrence of multiple features within distinct bins corresponding to
different classes. This approach serves to firmly establish the notion of conditional independence.
This underlying assumption draws upon the notion that, in the majority of classification scenarios, there
exists an overlap among several features. It is the features that contribute to discrimination that create distinctions
between diverse entities. By amplifying these feature values, a more nuanced and insightful outcome can be
derived compared to the analysis of linear features alone.
1.2 Multiple feature connections as likelihood
Bayesian classifier networks, built upon the foundational principles of Bayes' theorem, undertake the task
of classifying provided data denoted as x by employing the following equation:
C(x) = argmax_c [ P(c) × P(x_1, x_2, x_3, ..., x_m | c) ]
where C(x) represents the class variable. Considering conditional independence, the above equation becomes
C(x) = argmax_c [ P(c) × P(x_1|c) P(x_2|c) P(x_3|c) ... P(x_m|c) ]
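As a concrete illustration of this decision rule (not the author's implementation), here is a minimal Python sketch that evaluates the Naive Bayes form of the equation for a single input; the priors and per-feature likelihood tables are toy numbers invented for the example.

```python
# naive_bayes_argmax.py -- minimal sketch of the Naive Bayes decision rule above.
# The priors and likelihood tables are toy numbers, not values from the article.
priors = {"class_0": 0.6, "class_1": 0.4}                  # P(c)
# likelihoods[c][i][v] = P(x_i = v | c) for two binary features; illustrative only
likelihoods = {
    "class_0": [{0: 0.7, 1: 0.3}, {0: 0.4, 1: 0.6}],
    "class_1": [{0: 0.2, 1: 0.8}, {0: 0.5, 1: 0.5}],
}

def classify(x):
    """Return argmax_c of P(c) * prod_i P(x_i | c) for a feature vector x."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for i, value in enumerate(x):
            score *= likelihoods[c][i][value]
        scores[c] = score
    return max(scores, key=scores.get)

print(classify([1, 0]))   # 0.6*0.3*0.4 = 0.072 vs 0.4*0.8*0.5 = 0.160 -> "class_1"
```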
In the DBNN approach, a different strategy is employed. Here, the features undergo a discretization
process, wherein the feature values are categorized into distinct bins. Subsequently, the feature values within
these bins are sorted into class-specific categories. Rather than directly utilizing the original feature values,
the approach involves the utilization of histograms containing feature values from various bins, each aligned
with different classes. This histogram-based representation is then employed as the likelihood component in the
classification process.
The likelihood for a conditionally dependent event A can be approximated as the product of the likelihoods of its paired input features:
L(A|b, c, d, e) = L(A|b, c) × L(A|b, d) × L(A|b, e) × L(A|c, d) × L(A|c, e) × L(A|d, e)
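Read literally, this approximation multiplies the likelihoods of every pair of input features. A small sketch of that bookkeeping, using a made-up table of pair likelihoods (in DBNN these would come from the binned histograms described below), might look like this:

```python
# pairwise_likelihood.py -- sketch of L(A | b, c, d, e) as a product over feature pairs.
from itertools import combinations

# Toy pair likelihoods L(A | pair); placeholder values for illustration only.
pair_likelihood = {
    ("b", "c"): 0.30, ("b", "d"): 0.25, ("b", "e"): 0.40,
    ("c", "d"): 0.35, ("c", "e"): 0.20, ("d", "e"): 0.45,
}

def likelihood_of_A(features):
    """Approximate L(A | features) as the product of all pairwise likelihoods."""
    result = 1.0
    for pair in combinations(sorted(features), 2):
        result *= pair_likelihood[pair]
    return result

print(likelihood_of_A(["b", "c", "d", "e"]))   # product of the six pair terms
```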
In the context of E-DBNN, the computation of likelihood takes the form of a histogram capturing multiple
feature interactions. Following the discretization process, the feature vector within the training data is represented
as a two-dimensional grid denoted as F_{i,m}. Here, the rows correspond to distinct bins (m), while the columns
represent individual features (i). Consequently, in a dataset comprising n feature vectors, there emerge n
two-dimensional structures.
The focus then shifts towards examining the interconnections between these features. The resulting
histogram of these connections assumes the role of the conditional probability distribution, forming an integral
part of the likelihood computation process.
Consider an input dataset where the value of Feature A is placed within the first bin, Feature B’s value falls
into the third bin, and Feature C’s value is categorized into the fourth bin. This relationship can be symbolically
represented as ζ(A_1 B_3 C_4). Furthermore, if there exists a fourth feature, Feature D, and its value is assigned to the second bin, another conceivable connection emerges, denoted as ζ(B_3 C_4 D_2).
In the scenario where two-feature connections are being examined, they are denoted as ζ(A_1 B_3), ζ(B_3 C_4), and ζ(C_4 D_2). These connections can be generally indicated as F_{i+j,m}, where j varies within the range from 1 to N_c, and N_c denotes the number of features involved in the connection.
The quantity of features amalgamated to establish a connection is defined as a hyper-parameter of the
network. The choice of this parameter is based on the observation that as features are binned, their conditional
interdependence on other features gradually coalesces into a few connections.
Throughout the training phase, all such connections within the data are quantified and represented as C_{F_{ij,m}}. Furthermore, if the data encompasses N classes, similar connections can be constructed for each of
those classes.
For each class, these connections contribute to the establishment of conditional probabilities that define
the likelihood of features residing within specific bins.
The posterior value is approximated as the product of the likelihood values, multiplied by the prior weights of all the connections involved in the feature vector.
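To make the training and prediction steps described above concrete, here is a simplified, toy sketch (not the released E-DBNN code; see the GitHub link in the next section). It bins the features, counts multi-feature bin "connections" per class during training, and scores a new sample as the product of the resulting connection likelihoods and a class prior; the bin count, floor value and data are all invented for the example.

```python
# edbnn_sketch.py -- toy illustration of counting feature-bin connections per class.
# A simplified sketch of the idea only, not the released E-DBNN implementation.
import numpy as np
from collections import defaultdict
from itertools import combinations

N_BINS = 4        # number of bins per feature (example value)
N_CONNECT = 2     # features per connection (the hyper-parameter N_c)

def to_bins(X, low=0.0, high=1.0):
    """Discretize each feature value into one of N_BINS equal-width bins."""
    edges = np.linspace(low, high, N_BINS + 1)[1:-1]
    return np.digitize(X, edges)

def train(X, y):
    counts = defaultdict(lambda: defaultdict(float))   # counts[class][connection]
    priors = defaultdict(float)
    B = to_bins(X)
    for bins, label in zip(B, y):
        priors[label] += 1.0
        for idx in combinations(range(X.shape[1]), N_CONNECT):
            connection = tuple((i, bins[i]) for i in idx)   # e.g. ((0, 1), (2, 3)) ~ A1 C3
            counts[label][connection] += 1.0
    for label, total in priors.items():
        for conn in counts[label]:
            counts[label][conn] /= total                    # likelihood of the connection
    for label in priors:
        priors[label] /= len(y)
    return counts, priors

def predict(x, counts, priors):
    bins = to_bins(x.reshape(1, -1))[0]
    scores = {}
    for label, prior in priors.items():
        score = prior
        for idx in combinations(range(len(x)), N_CONNECT):
            connection = tuple((i, bins[i]) for i in idx)
            score *= counts[label].get(connection, 1e-6)    # small floor for unseen connections
        scores[label] = score
    return max(scores, key=scores.get)

# tiny synthetic example
rng = np.random.default_rng(0)
X = rng.random((60, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
counts, priors = train(X, y)
print(predict(np.array([0.9, 0.8, 0.2]), counts, priors))   # most likely prints 1
```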
1.3 Code Availability
The source code of E-DBNN, in Python, is available on the GitHub page: https://github.com/blessoncms/E-DBNN
1.4 Summary
E-DBNN is characterized as a Bayesian classifier that attains enhanced performance outcomes through
the strategic utilization of imposed conditional independence and difference boosting methodologies. By
conducting a comparative analysis, we have evaluated the accuracy of E-DBNN against alternative approaches
grounded in Naive Bayes principles. The findings of this comparison indicate a consistent and systematic
enhancement in the obtained results, thereby underscoring the efficacy of the E-DBNN approach.
About the Author
Blesson George is currently working as Assistant Professor of Physics at CMS College Kottayam,
Kerala. His research interests include developing machine learning algorithms and application of machine
learning techniques in protein studies.
From Information Overload to Clarity: The
Power of Text Summarization (Part 2)
by Jinsu Ann Mathew
airis4D, Vol.1, No.9, 2023
www.airis4d.com
In the last article, we explored different techniques for condensing text. We focused on two key aspects:
selecting what to summarize based on the type of input and the intended goal. Now, in this article, we are
shifting our focus to the last type of summarization, which centers on the desired output. This comprises two
methods: extractive summarization and abstractive summarization (Figure 1).
Extractive summarization involves selecting and extracting existing sentences or phrases directly from the
source text to create a summary. These selected sentences are usually the most important, informative, or
representative ones from the original text. This approach doesn’t involve generating new sentences; instead, it
focuses on choosing the most relevant parts that maintain the essence of the original content. Think of it as a
process of ”copying and pasting” significant portions from the source text to create a summary.
Abstractive summarization is a more advanced approach that involves generating new sentences that may not
exist in the source text. Instead of simply copying sentences, abstractive summarization involves understanding
the content of the source text and using that understanding to create concise and coherent summaries in a
more human-like manner. This method requires natural language generation techniques and often involves
paraphrasing and rephrasing to capture the core ideas while using different words and structures.
In essence, extractive summarization is like selecting puzzle pieces from the original picture, while
abstractive summarization is like creating a new puzzle using the same theme and colors but rearranging the
pieces to form a coherent and concise representation. Both approaches have their strengths and challenges, and researchers continue to explore ways to improve their effectiveness and efficiency. In this article, we will closely examine these two methods.
(image courtesy: https://www.abstractivehealth.com/extractive-vs-abstractive-summarization-in-healthcare)
Figure 1: Extractive and Abstractive summarization
Basic tasks of extractive summarization
Extractive summarization systems are a type of text summarization that selects a subset of sentences from
the input text to form a summary. They are centered on three core operations: 1) Construction of an intermediate
representation of the input text, 2) Scoring the sentences based on the representation, 3) Selection of a summary
comprising several sentences. In the following section, we will discuss each of these operations in more detail.
2.1 Construction of an intermediate representation of the input text
When constructing an intermediate representation of the input text for extractive summarization, the goal is
to transform the textual content into a structured format that captures the essential information of each sentence.
This representation serves as a foundation for subsequent steps like scoring and selection. To achieve this, each
sentence from the tokenized text is assigned a numerical representation. This is often accomplished through the
use of word embeddings, such as Word2Vec or GloVe, which convert words into multi-dimensional vectors in a
semantic space. By averaging or combining these word vectors, sentence representations are formed, capturing
the semantic context of the sentence’s content.
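As a rough sketch of this averaging step, using a tiny hand-written embedding table in place of real Word2Vec or GloVe vectors, sentence representations could be built like this:

```python
# sentence_vectors.py -- averaging word vectors into sentence representations.
# The tiny embedding table below is a stand-in for real Word2Vec/GloVe vectors.
import numpy as np

embeddings = {
    "cats":  np.array([0.9, 0.1, 0.0]),
    "dogs":  np.array([0.8, 0.2, 0.1]),
    "sleep": np.array([0.1, 0.9, 0.3]),
    "run":   np.array([0.2, 0.7, 0.8]),
}
DIM = 3   # dimensionality of the toy vectors

def sentence_vector(sentence):
    """Average the vectors of known words; unknown words are simply skipped."""
    words = [w for w in sentence.lower().split() if w in embeddings]
    if not words:
        return np.zeros(DIM)
    return np.mean([embeddings[w] for w in words], axis=0)

sentences = ["Cats sleep", "Dogs run", "Cats and dogs run"]
matrix = np.vstack([sentence_vector(s) for s in sentences])   # one row per sentence
print(matrix.shape)   # (3, 3): three sentences, three-dimensional representation
```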
Once individual sentence representations are established, they are combined or aggregated to generate
a unified representation of the entire input text. Various methods can be employed for this aggregation,
including simple averaging of sentence vectors, weighted averages that consider sentence importance, or more
complex mathematical techniques like Principal Component Analysis (PCA) or clustering. This aggregated
representation transforms the textual content into a structured data format, such as a matrix, where each row
corresponds to a sentence and each column represents different dimensions or features.
In essence, the creation of the intermediate representation involves converting the textual information into
a structured numerical format that captures semantic meaning. This representation enables effective analysis
of the source text and plays a vital role in generating meaningful and coherent extractive summaries. It serves
as the intermediary step that bridges the gap between the raw textual data and the subsequent computational
processes required for extractive summarization.
2.2 Scoring the sentences based on the representation
This step involves assessing the significance and relevance of each sentence in the source text through
the use of numerical values derived from its transformed representation. The scores assigned to sentences are
determined by a range of metrics and methodologies. One such metric is TF-IDF (Term Frequency-Inverse
Document Frequency), which gauges the importance of a word within a sentence relative to its frequency in the
entire document. Sentences containing rare but significant terms can thus attain higher scores. Additionally,
semantic analysis techniques come into play, leveraging natural language processing and machine learning
algorithms to comprehend the contextual import of sentences in relation to the broader text.
Positional and structural elements also contribute to scoring. Introductory sentences or conclusive remarks
may carry greater weight due to their positions within the text’s narrative flow. Furthermore, structural cues
such as headings might influence the scoring process. Weighted scoring, where specific features receive varying
degrees of importance, allows terms of exceptional significance to hold more weight in the final score calculation.
Ultimately, the scoring of sentences based on their representation serves as a compass, guiding the
summarization system toward the selection of sentences that collectively construct a coherent and concise
summary reflective of the source text's main content.
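One simple way to realise such TF-IDF-based scoring is sketched below with scikit-learn's TfidfVectorizer, summing each sentence's term weights to obtain a score; this is only one of many possible scoring schemes, and the example sentences are invented.

```python
# tfidf_scores.py -- scoring sentences by the sum of their TF-IDF term weights.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "Text summarization condenses long documents.",
    "The weather was pleasant yesterday.",
    "Extractive summarization selects important sentences from the document.",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(sentences)        # sparse matrix: sentences x terms

scores = np.asarray(tfidf.sum(axis=1)).ravel()     # one aggregate score per sentence
for sentence, score in zip(sentences, scores):
    print(f"{score:.3f}  {sentence}")
```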
2.3 Selection of a summary comprising several sentences
The summarization system picks out the most important sentences to create a summary. Some methods
choose these sentences using step-by-step strategies, always going for the most crucial ones based on their
scores. Others approach the task like solving a puzzle, aiming to find a group of sentences that make the
summary both important and clear while avoiding repetition.
One way to do this is by using greedy algorithms, which means the system goes through sentences one by
one, selecting the most important until it reaches the desired number of sentences, often represented as ’k’. This
way, it builds the summary step by step, making efficient decisions along the way.
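A minimal sketch of that greedy step, given per-sentence scores like those computed earlier, is simply to take the top k sentences and re-emit them in their original order:

```python
# greedy_select.py -- pick the k highest-scoring sentences, keep original order.
def greedy_summary(sentences, scores, k=2):
    """Return the k top-scoring sentences, presented in document order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:k])                  # restore original document ordering
    return [sentences[i] for i in chosen]

sentences = ["Sentence one.", "Sentence two.", "Sentence three."]
scores = [0.2, 0.9, 0.5]
print(greedy_summary(sentences, scores, k=2))    # ['Sentence two.', 'Sentence three.']
```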
On the other hand, another approach treats sentence selection as a kind of puzzle-solving. The system tries
to find the best combination of sentences that not only makes the summary informative and coherent but also
keeps it from being repetitive. This method often uses math and graphs to figure out the best mix of sentences
that work well together.
Overall, the goal is to create a summary that captures the main points, reads smoothly, and doesn't repeat itself. It's about finding the right balance, using clever strategies to build a summary that mirrors the original text's essence.
Abstractive summarization
Now, let’s dive into the world of abstractive summarization. This is where things get interesting because
we’re not just copying sentences anymore. Abstractive summarization allows us to express ourselves more
creatively while making sure we still catch the main ideas. We'll look at two different ways to do this: one that focuses on keeping the structure of the original text (structure-based approach) and another that's all about understanding the meaning and expressing it in a fresh way (semantic-based approach) (Figure 2). It's like exploring two paths to create summaries that stand out and convey the important stuff in a cool new style.
2.4 Structure-Based Approach
The structure-based approach in abstractive summarization emphasizes the preservation of the underlying
structure of the source text. This method aims to generate a summary that not only captures the main ideas but
also follows a coherent narrative flow similar to the original text. It often involves reorganizing and paraphrasing
sentences while maintaining the logical and sequential arrangement of ideas.
In this approach, the summarization system first analyzes the structural elements of the source text, such
as headings, subheadings, paragraph divisions, and transitions between sections. Then, it employs techniques
like syntactic analysis and grammar parsing to understand how sentences are interconnected. The system might
generate sentences that are not verbatim repetitions from the source text but capture the essence and flow of the
content.
(image courtesy: https://www.researchgate.net/figure/Overview-of-Abstractive-Summarization-This-paper-collectively-summarizes-the-major_fig4_305912913)
Figure 2: Two types of abstractive summarization
2.5 Semantic-Based Approach
The semantic-based approach in abstractive summarization focuses on understanding the meaning of the
source text and generating summaries that convey this meaning using different words and phrasing. This method
goes beyond surface-level paraphrasing and aims to generate novel sentences that capture the same semantic
information as the original text.
In this approach, the summarization system employs natural language understanding techniques, such as
deep learning models and neural networks. These models are trained on large amounts of text data to learn the
relationships between words, phrases, and concepts. The system uses this learned understanding to generate
coherent and contextually appropriate sentences that express the key ideas of the source text. Semantic-based
summarization often involves rephrasing, word substitution, and even generating entirely new sentences to
convey the intended meaning.
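In practice, such neural abstractive systems are usually accessed through pre-trained models. As one hedged example, the Hugging Face transformers library exposes a summarization pipeline; the model named below is one publicly available option, chosen purely for illustration, and the input text is made up.

```python
# abstractive_demo.py -- abstractive summarization with a pre-trained model.
# Requires the `transformers` package; the model weights are downloaded on first use.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

text = (
    "Abstractive summarization systems read a document, build an internal "
    "representation of its meaning, and then generate new sentences that "
    "convey the same ideas in a shorter form, rather than copying sentences verbatim."
)

result = summarizer(text, max_length=40, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```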
To summarize, the structure-based approach focuses on maintaining the original text's structural organization while rephrasing and rearranging sentences for coherence. The semantic-based approach, on the other
hand, aims to capture the underlying meaning of the text and generates novel sentences that convey the same
concepts. Both approaches contribute to the challenging task of abstractive summarization, where the goal is
to create concise and human-like summaries that capture the core ideas of the source material.
Conclusion
In the world of making summaries from text, we’ve been exploring different ways to capture the main ideas
and make things shorter. We’ve learned how to condense information to match various needs and goals. As we
conclude our exploration, we find ourselves equipped with a good understanding of three fundamental types of
summarization techniques: those based on input, output type, and purpose.
Exploring both single and multiple documents, we've seen how to make information shorter. Whether
from an individual source or a collection of related documents, the ability to distill key information equips us to
navigate the sea of content more efficiently.
Unveiling the concept of summarization based on purpose, we've been introduced to the dynamic trio of generic, query-based, and domain-specific summarization. Every type has its own special way of making
summaries that match what users need. This can be as general as getting the main ideas, or as specific as
answering questions or focusing on particular topics.
Introducing the idea of summarization focused on the output, we delve into two techniques known as
extractive and abstractive summarization. These approaches assist in selecting crucial sentences or generating
fresh ones to grasp the key ideas.
All these techniques contribute significantly to enhancing our understanding of text summarization. By
delving into these methods, we gain valuable insights into the various ways to condense information effectively.
These approaches collectively equip us with the knowledge and tools needed to master the art of creating concise
and meaningful summaries.
About the Author
Jinsu Ann Mathew is a research scholar in Natural Language Processing and Chemical Informatics.
Her interests include applying basic scientific research on computational linguistics, practical applications of
human language technology, and interdisciplinary work in computational physics.
Part II
Astronomy and Astrophysics
Guide to Practical Machine Learning for
Astronomy - Part I
by Linn Abraham
airis4D, Vol.1, No.9, 2023
www.airis4d.com
This article is meant to be a guide for people from a scientific background who are interested in venturing into the field of machine learning but lack the technical know-how. This is the first in a series of articles aiming to achieve that goal. Before we go in-depth, let us first try to summarize the journey of a machine learning
researcher working on solving an astrophysical problem.
1.1 Stages of a Machine Learning Project
1. The first stage of the machine learning researcher's journey involves setting up his or her computer.
This involves deciding on the hardware requirements, operating system, programming language and
frameworks to use. There are also add-on modules or libraries that are helpful to install, as well as IDEs
and other tools that make your life easier.
2. The second stage is often the most difficult. Here you need to define a scientific problem to solve using
ML or DL. A survey of scientific problems that exist in your field of study is a necessary pre-requisite.
You also need to have a good grasp of the capabilities and limitations of ML or DL techniques to make a
decision here.
3. If you have a well-defined problem, then in the third stage, you need to worry about the data and algorithms.
Making visualization of your data helps a lot towards understanding its nature as well as deciding which
methods to use. Several parameters like the size and availability of your data, the complexity of the
problem and the computational time and expense that are affordable etc. go into deciding the methods to
be used.
4. In the next stage, you need to process your data before you pass it to your ML/DL model. You also need to
design and implement your model. Both these things require writing good code.
5. The next stage consists of the model fitting or the training stage. This is followed by the evaluation of
your model performance.
6. The final stage is where you use all the feedback to go back and make improvements in your model so as
to increase its performance.
The rest of the article deals with the first stage of the journey, which is all about having the proper technical
setup.
1.2 Setting up your computer
You need a computer at two different stages of the journey - one while developing your code and one for running the training of your ML algorithms. You need a much more powerful system for running most modern ML/DL algorithms than what is required for developing the code. Thankfully, there are free and paid options for training that are often better than owning your own hardware. If you do have the budget for a laptop with a dedicated graphics card, make sure you choose one with an NVIDIA GPU. But for developing code you do not need a beefy system. Even a machine with just an Intel i3 processor or its Ryzen equivalent and 4GB of RAM would suffice. If your laptop comes preinstalled with Windows, try to get it to dual boot a Linux OS. Linux is your best friend when it comes to the world of code development. The term Linux doesn't refer to any particular OS that you can install; rather, there are various flavours or distributions of Linux which you can actually install on your machine. Ubuntu is one of the most popular Linux distributions. However, if you are looking for something more interesting than Ubuntu, Arch-based distributions like EndeavourOS are a good option. The Arch User Repository (AUR) is one of the best features of Arch Linux; it makes it easy for beginners to install packages that are often missing from official Linux repositories. Once you are comfortable with using Linux and you feel as if you want to get into the inner workings, one of the best ways to start is by installing the [Arch Linux] OS. You can do this on something like a spare pendrive by following the Arch installation guide.
1.3 The programming setup
Once the OS is set up, deciding on a programming language to use is the next step. You do not have to make much of a decision here. Python is the language that is most suited for machine learning development. It comes pre-installed on most Linux distributions. One of the main advantages of Python is its simplicity and free and open-source nature. This has gained Python a large number of users who have contributed a lot of code. These contributions are available as modules that are easily installed from a central repository. The most commonly used modules for ML developers are the following: Numpy, Pandas, Matplotlib, Scipy, Scikit-Learn, PIL, OpenCV and Tensorflow. The program that helps you install packages from the [Python Packaging Index] is called pip, an acronym for "Pip Installs Packages". Sometimes, if you are required to use a Python version that is not already installed and you do not have sufficient permission to install it, the [Miniconda] distribution becomes a useful tool. Conda environments are slightly different from ordinary Python virtual environments because they are capable of installing different versions of Python as well as non-Python dependencies that are sometimes required for installing Python packages. Conda does this using a container-like environment that keeps the installed dependencies separate from the ones already installed on your system.
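As a quick check that such an environment is working, a minimal script along the lines of the sketch below (the file name sanity_check.py is just an example) imports a few of the modules listed above and exercises them briefly.

```python
# sanity_check.py -- minimal check that the core scientific stack imports and runs.
# Assumes numpy, pandas and matplotlib are installed in the active environment.
import numpy as np
import pandas as pd
import matplotlib

matplotlib.use("Agg")   # headless backend, so this also works on a remote server
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)                 # 100 evenly spaced points
df = pd.DataFrame({"x": x, "sin_x": np.sin(x)})    # wrap them in a DataFrame

plt.plot(df["x"], df["sin_x"])
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.savefig("sanity_check.png")                    # written to the current directory
print(df.describe())                               # basic summary statistics
```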
How do you start writing Python code? There are two ways in which people write machine learning code. The first way uses scripts, which are basically text files with a .py extension. Any good old text editor would suffice, though some editors are especially useful: terminal-based editors like Vim, Nano and Emacs come in handy when you have to run code on remote servers where there is no graphical interface. The second way is using Jupyter Notebooks, which are custom JSON-format files with a .ipynb extension. Jupyter Notebooks, opened in a web browser, enable the user to save both the code and the output of each cell. There also exist IDEs other than Jupyter Notebook that are worth considering; VSCode and Sublime are two such options. Before you start using Python on your Linux system, be sure to check out how to create a Python virtual environment, and always make sure that you are installing Python packages inside a virtual environment.
1.4 Code version control and Github
An important piece of technology that can make your life as a coder much easier in the long run is a version
control system. When you write code, you often find the need to undo your work. If you have saved your work
already, you might find it difficult to revert unless you have saved versions of your code at different times.
Creating such versions manually can quickly become a hectic thing to manage. Git is the most popular version
control system. It allows you to make manual checkpoints (called commits) in your Git repository. A repository
in git is just the primary folder in your system that contains all of the data and code related to your machine
learning project. Keep in mind that Git is designed to track only text files and not image or binary data. Do not
version control your data using Git. There are other options that you can check out if you are interested in version
controlling your models or data (e.g. [Data Version Control]). When using Git, it is advisable to create a git branch in your repository every time you think of making an improvement to your existing code. If the path taken is not to your liking, throw away the branch. If, after a while, you feel the improvements made are there to stay, merge the branch into your already existing best version (called the master or main branch).
The popular code sharing platform [Github] is a collection of git repositories made public by various people
and companies. ML researchers can use Github to share their code for other people to use and review. They can
also share their code privately with other developers or mentors while in the developmental stage using private
repositories. Github provides a cloning method based on Personal Access Tokens (PAT) that makes it straightforward to clone your private repositories on any system. Code repos on Github that make use of Python usually have a requirements.txt file that shows which external packages are required to run the code. Create such a file for your own project. It is also a good practice to create a README file that mentions how the scripts in the repo can be run.
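For illustration, a requirements.txt for a small ML project might look like the sketch below; the package list and version pins are only an example, not a prescription. Running pip install -r requirements.txt inside a fresh virtual environment then recreates the setup.

```
# requirements.txt -- example only; pin the versions your own project actually uses
numpy>=1.24
pandas>=2.0
matplotlib>=3.7
scikit-learn>=1.3
tensorflow>=2.13
```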
1.5 Where to get help?
In this section, I will list resources you can use to learn about the things that have already been discussed, along with some further reading.
1. [Scipy Lectures] - Getting started with Python and learning scientific packages like Numpy, Matplotlib
and Scipy.
2. [Exercism] - A website for learning the basics of Python using exercises.
3. [PyImageSearch] - Adrian’s blog for Computer Vision
4. [Coursera] & [Udacity] - Machine Learning Courses
5. [Google Colaboratory] - Free resource for developing and/or training ML models. Jupyter notebook
running on a virtual machine in the cloud. Note that a single session can run only up to a maximum of 12 hours.
6. [Python Packaging Index] - Repository of python packages
7. [Gdown] - Python package for downloading data from Google Drive using a shareable link
8. Deep Learning with Python, [Chollet(2018)] - Introduction to Deep Learning using Python and the Keras
framework
9. [Arch Linux Wiki] - Solution to most troubles you might face on Linux
10. [Luke Smith Youtube channel] - Getting around to using an Arch Linux based OS
References
[Arch Linux] Arch Linux. https://archlinux.org/.
[Python Packaging Index] Python Packaging Index. https://pypi.org/.
[Miniconda] Miniconda. https://docs.conda.io/en/latest/miniconda.html.
[Data Version Control] Data Version Control. https://dvc.org/.
[Github] Github. https://github.com/.
[Scipy Lectures] Scipy Lectures. https://scipy-lectures.org/.
[Exercism] Exercism. https://exercism.org/.
[PyImageSearch] PyImageSearch. https://pyimagesearch.com/.
[Coursera] Coursera. https://coursera.org/.
[Udacity] Udacity. https://www.udacity.com/.
[Google Colaboratory] Google Colaboratory. https://research.google.com/colaboratory/.
[Gdown] Gdown. https://pypi.org/project/gdown/.
[Chollet(2018)] François Chollet. Deep Learning with Python. Manning Publications Co, Shelter Island, New York, 2018. ISBN 978-1-61729-443-3.
[Arch Linux Wiki] Arch Linux Wiki. https://wiki.archlinux.org/.
[Luke Smith Youtube channel] Luke Smith Youtube channel. https://www.youtube.com/@LukeSmithxyz.
About the Author
Linn Abraham is a researcher in Physics, specializing in A.I. applications to astronomy. He is
currently involved in the development of CNN based Computer Vision tools for classifications of astronomical
sources from PanSTARRS optical images. He has used data from several large astronomical surveys including
SDSS, CRTS, ZTF and PanSTARRS for his research.
The Hertzsprung-Russell Diagram:
Exploring Stellar Evolution and Diversity
by Robin Jacob Roy
airis4D, Vol.1, No.9, 2023
www.airis4d.com
Stars, like the broader Universe, undergo changes over time. They come into existence within areas of concentrated gas, where the accumulation process is activated by external factors like the influence of neighboring supernovae. When numerous stars take shape in close proximity and typically within a similar timeframe, this gathering is termed a star cluster. These clusters can exhibit varying levels of mass and metal content based on the composition of their initial gas clouds. Astronomers employ the Hertzsprung-Russell diagram to map out the developmental phase of a star.
The Hertzsprung-Russell diagram, commonly referred to as the HR diagram, is a fundamental tool in astronomy that provides insights into the characteristics and evolution of stars. Named after its creators, Danish astronomer Ejnar Hertzsprung and American astronomer Henry Norris Russell, this graphical representation is a powerful tool for categorizing stars based on their luminosity, temperature, spectral type, and evolutionary stage. The HR diagram has revolutionized our understanding of stellar properties and their life cycles, and it continues to be a cornerstone of astrophysical research. The HR diagram is shown in Figure 1.
Figure 1: The Hertzsprung-Russell diagram is a graphical representation that plots the temperatures of stars
against their luminosities. The position of a star in the diagram provides information about its present stage and
its mass. Source: ESO
Figure 2: Comparing the Structure and Evolution of the Sun: From its Present State to its Future as a Red
Giant. Source: ESO
2.1 Key Features of the HR Diagram
1. Luminosity (Absolute Brightness): The luminosity of a star represents the total amount of energy it
radiates into space. It is plotted on the vertical axis of the HR diagram and is usually presented on a
logarithmic scale (see the plotting sketch after this list).
2. Temperature (Spectral Type or Color): The temperature of a star affects its color and spectral character-
istics. The temperature scale is typically shown on the horizontal axis, ranging from cooler red stars to
hotter blue stars.
3. Spectral Type: The spectral type of a star is determined by its surface temperature and is usually classified
using the letters O, B, A, F, G, K, and M. These letters correspond to specific temperature ranges, with O
being the hottest and M being the coolest.
4. Evolutionary Stage: The HR diagram allows astronomers to track the evolution of stars from their birth
to their eventual death. Stars follow distinct paths on the diagram as they change over time, transitioning
from the main sequence to various evolutionary stages such as red giants, supergiants, and white dwarfs.
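To make the layout of these axes concrete, here is a minimal matplotlib sketch of an HR-style scatter plot; the handful of (temperature, luminosity) values are made-up placeholders, not real catalogue data.

```python
# hr_sketch.py -- a toy HR-style diagram; the data points below are illustrative only.
import matplotlib.pyplot as plt

# (effective temperature in K, luminosity in solar units) -- placeholder values
temperature = [30000, 10000, 6000, 4000, 3500]
luminosity = [1e5, 80, 1.0, 0.1, 1e3]

plt.scatter(temperature, luminosity)
plt.xlabel("Effective temperature (K)")
plt.ylabel("Luminosity (solar units)")
plt.yscale("log")           # luminosity spans many orders of magnitude
plt.gca().invert_xaxis()    # HR diagrams run from hot (left) to cool (right)
plt.title("Sketch of an HR-style diagram")
plt.savefig("hr_sketch.png")
```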
2.2 Various Objects on the HR Diagram
Main Sequence Stars: Main sequence stars, often referred to as the workhorses of the cosmos, represent the most abundant and long-lasting stage in a star's life cycle. These stars, including our Sun, achieve
equilibrium between gravitational collapse and nuclear fusion in their cores, emitting a steady stream of
energy in the form of light and heat. Their luminosity and surface temperature follow a well-defined
relationship, creating the iconic diagonal band on the Hertzsprung-Russell diagram. Main sequence stars
fuse hydrogen into helium, sustaining their brilliance for billions of years while serving as cosmic beacons
that shape the fundamental properties of galaxies and the evolution of the universe.
Red Giants and Supergiants: As stars exhaust their hydrogen fuel, they expand and cool, moving away
from the main sequence toward the upper right of the HR diagram. Red giants and supergiants are larger
and more luminous than main sequence stars, with evolved cores undergoing helium fusion or other
nuclear reactions. In about 6 billion years, as the Sun enters its red giant phase, the depletion of fuel
within its core will lead to a gradual slowdown of hydrogen fusion. Consequently, the core will undergo a
contraction process, increasing its temperature until it triggers a renewed phase of nuclear fusion. During
this phase, helium will fuse into more complex elements like carbon, nitrogen, and oxygen. Concurrently,
the elevated core temperature will drive hydrogen fusion in the surrounding "shell" of material enveloping
the core. Simultaneously, internal heat generation deep within the star will result in the expansion of its
outer gas layer. Figure 2 depicts a size comparison of our Sun in its current phase and its red giant phase.
Figure 3: Captured by the Hubble Space Telescope, the image portrays Sirius A and Sirius B. In the image, the much brighter Sirius A is prominent, while the fainter Sirius B, a white dwarf, is visible as a subtle point of light positioned towards the lower left of Sirius A. Source: ESA
White Dwarfs: These are the remnants of low- to medium-mass stars that have exhausted their nuclear
fuel and undergone gravitational collapse. These incredibly dense objects pack a mass comparable to that
of the Sun into a volume roughly equivalent to that of Earth. Supported by electron degeneracy pressure,
white dwarfs are stable and gradually cool over billions of years, transitioning from a luminous and hot
state to becoming fainter and cooler. Their evolutionary paths are influenced by their initial masses and
the processes leading to their formation, making them crucial in understanding stellar life cycles and the
ultimate fate of stars. Figure 3 illustrates a size comparison between Sirius B, a white dwarf, and its
companion, Sirius A.
Protostars and Pre-Main Sequence Stars: Objects that are still in the process of gravitational contraction
and heating before they start hydrogen fusion fall in the protostar or pre-main sequence region of the
HR diagram. These objects are often deeply embedded in gas and dust clouds and can exhibit strong
variability.
Blue Supergiants and Wolf-Rayet Stars: These extremely luminous stars are massive and hot, often located
in the upper-left region of the HR diagram. They have high rates of mass loss due to strong stellar winds.
Variable Stars: Stars with irregular or periodic variations in brightness, such as Cepheid variables and
RR Lyrae stars, can be found in specific regions of the HR diagram. These luminosity fluctuations can
stem from various intrinsic or extrinsic factors, such as pulsations, eclipses, or eruptions. By monitoring
the patterns of variation, astronomers can deduce crucial information about the stars' properties, including their distances, sizes, temperatures, and evolutionary stages. These stars are therefore crucial tools for measuring cosmic distances.
In conclusion, the Hertzsprung-Russell diagram serves as a visual roadmap for understanding the diverse
life cycles of stars, their characteristics, and their behaviors. By analyzing the distribution of stars on this
diagram, astronomers can glean valuable insights into the processes that shape the universe and deepen our
understanding of stellar evolution.
About the Author
Robin is a researcher in Physics specializing in the applications of Machine Learning for Astronomy
and Remote Sensing. He is particularly interested in using Computer Vision to address challenges in the fields of
Biodiversity, Protein studies, and Astronomy. He is presently engaged in utilizing machine learning techniques
for the identification of star-forming knots.
A Journey into the Enigmatic Universe of
X-ray Binaries
by Sindhu G
airis4D, Vol.1, No.9, 2023
www.airis4d.com
3.1 X-ray binaries
X-ray binaries (Fig: 1) encompass a category of binary star systems comprising a compact object and a
companion star. The companion star has the possibility of being a main sequence star, a giant star, or, on rare
occasions, a white dwarf. The compact object is a stellar remnant, either a neutron star or a black hole. A binary
system with a white dwarf as the compact object is called a cataclysmic variable (CV). The accretor is the term
used for the compact object in an X-ray binary, while the donor is the term applied to the companion star.
The label "X-ray binaries" originates from their substantial emission of X-ray radiation. Within X-ray
binaries, the accumulation of material onto the compact object can transpire through two principal processes:
Roche lobe overflow and stellar wind accretion. The gravitational attraction pulls matter from the companion
star towards the compact object. As this material descends onto the compact object, it coalesces into an
accretion disk—a dynamic disk of gas and particulates encircling the compact entity. The presence of friction
and gravitational forces within the accretion disk results in a considerable rise in material temperatures, giving
rise to the emission of X-rays.
3.2 Classification
X-ray binaries can be categorized into three main types according to the mass of the star that is losing
mass, namely, the companion star.
3.2.1 Low-mass X-ray binaries (LMXBs)
Low-mass X-ray binaries, often referred to as LMXBs (Fig: 2), belong to a class of X-ray binaries where the companion star is usually a low-mass star, such as a main sequence star or a white dwarf. The compact
object can be either a neutron star or a black hole. Low-mass X-ray binaries lack a strong stellar wind. In these
systems, mass transfer from the companion star to the compact object occurs through Roche lobe overflow.
As the companion star fills its Roche lobe, material surpassing its limits begins to overflow, gravitating
towards the compact object. This ejected material forms an accretion disk encircling the compact object,
creating a dynamic spiral inward due to gravitational forces. LMXBs are characterized by their relatively low X-ray luminosity. Around 200 LMXBs are presently recognized within the Milky Way galaxy, with 13 having been identified within globular clusters.
Figure 1: Artist's conception of an X-ray binary system. Source: NASA/GSFC.
Figure 2: Low Mass X-ray Binary. Source: NASA.
Figure 3: An artist's impression of a High Mass X-ray Binary. Source: NASA/CXC/M.Weiss.
Typically, LMXBs emit the majority of their radiation in the X-ray spectrum, with visible light emissions
significantly fainter, constituting less than one percent of the total radiation output. These binaries typically
exhibit apparent magnitudes ranging from 15 to 20. The orbital periods of LMXBs can exhibit a wide range,
spanning from brief durations of ten minutes to extensive periods lasting hundreds of days.
In LMXBs, the companion star tends to be a late-type star, encompassing spectral types F, G, K, and M.
These stars are cooler and less massive than early-type counterparts like A-type stars. The variability of LMXBs
is often observed through phenomena such as X-ray bursts, known as X-ray bursters, as well as X-ray pulsations.
3.2.2 High-mass X-ray binaries (HMXBs)
High-mass X-ray binaries, or HMXBs (Fig: 3), involve a companion star of considerable mass, typically a
massive star within its main sequence or giant phase. The compact object in HMXBs can either be a neutron
star or a black hole. The massive companion star in HMXBs produces strong stellar winds because of its intense
radiation and high luminosity. These winds carry material away from the companion star. A portion of this
material may be captured by the gravitational force of the compact object, resulting in the formation of an
accretion disk or a direct flow onto the compact object.
Within HMXBs, the massive star—often an O or B type star—typically governs the emission of optical
light, while the compact object serves as the primary source of X-rays. Variability within HMXBs is evident
through phenomena such as X-ray pulsars rather than X-ray bursters. Notably, HMXBs generally display a
greater X-ray luminosity in comparison to LMXBs.
3.2.3 Intermediate-mass X-ray binaries (IXRBs)
As the name suggests, IXRBs have characteristics that fall between those of LMXBs and HMXBs. The
donor star in an IXRB is typically an intermediate-mass star, such as an F- or G-type star, with a mass higher than
that of a low-mass star but lower than that of a massive star found in HMXBs. The compact object in an IXRB
can be a neutron star or a black hole. These systems can show a variety of X-ray behaviors and provide valuable
insights into the accretion processes and interactions between stars and compact objects.
About the Author
Sindhu G is a research scholar in Physics doing research in Astronomy & Astrophysics. Her research
mainly focuses on the classification of variable stars using different machine learning algorithms. She also works
on the period prediction of different types of variable stars, especially eclipsing binaries, and on the study of optical
counterparts of X-ray binaries.
Part III
Biosciences
Microflora of the intestine
by Geetha Paul
airis4D, Vol.1, No.9, 2023
www.airis4d.com
1.1 Introduction
The intestinal microflora, also known as the gut microbiota, is a complex ecosystem containing between 400 and
1000 bacterial species inhabiting the human digestive tract. The upper gastrointestinal tract, including
the stomach, duodenum, jejunum, and upper ileum, typically possesses a sparse microflora, with anaerobes
(microorganisms that live in the absence of oxygen) outnumbering facultative anaerobes. The bacterial concentration
is less than 10⁴ organisms/ml of intestinal secretions. The flora is sparse in the stomach and upper intestine but
luxuriant in the lower bowel. Bacteria occur both in the lumen and attached to the mucosa but do not usually
penetrate the bowel wall. This intricate ecosystem encompasses various microorganisms coexisting within the
intestinal environment, such as bacteria, archaea, fungi, and viruses. The microflora is fundamental in essential
physiological processes and significantly contributes to overall host health and functioning. The delicate balance
and composition of the intestinal microflora hold profound implications for vital functions, including digestion,
nutrient absorption, immune system modulation, and protection against harmful pathogens.
Within this community, microorganisms interact among themselves and with the host, forming complex
relationships that collectively maintain the stability and proper functioning of the gut environment. The
relationship between the gut microbiota and human health is being increasingly recognised. It is now well
established that a healthy gut flora is mainly responsible for the overall health of the host. Diverse factors,
including diet, age, genetics, and environmental exposures, influence the composition of the intestinal microflora.
Variations in this microflora can impact individual health outcomes and susceptibility to various diseases. A
disruption in the balance of the microflora, known as dysbiosis, has been associated with various health
conditions, encompassing inflammatory bowel diseases, metabolic disorders, and mental health issues. Ongoing
research into the intestinal microflora has advanced our understanding of its intricate roles in preserving
homeostasis and has led to potential therapeutic interventions. Advances in sequencing technologies have
enabled the identification and characterisation of specific microbial species and their functional roles within the
gut ecosystem. This knowledge has paved the way for developments in personalised medicine and strategies
aimed at modulating the gut microflora to enhance overall health and well-being. In summation, the intestinal
microflora is a dynamic and pivotal component of the gastrointestinal system, exerting substantial influence on
diverse aspects of host health and functioning. Its multifaceted interactions and roles remain subjects of active
research, holding promising avenues for expanding our comprehension of human health and disease.
1.2 Composition of the normal gut biota
(image courtesy: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528021/)
Figure 1: Distribution of the average human gut flora.
The gut microbiota is highly diverse, with recent research revealing a more extensive array of microbial
species and genetic potential than previously thought. Initially, the gut microbiota was believed to consist of
around 500-1000 species, but a recent study estimated over 35,000 bacterial species. The Human Microbiome
Project and MetaHIT studies indicate that there could be over 10 million non-redundant genes in the human
microbiome. Danish research introduced the concept of High Gene Count (HGC) and Low Gene Count
(LGC) microbiomes, impacting health and disease. The HGC microbiome has a robust composition, including
beneficial bacteria like Akkermansia, associated with digestive health and lower metabolic disorders. LGC
individuals have a higher proportion of pro-inflammatory bacteria linked to conditions like inflammatory
bowel disease. The healthy gut microbiota consists mainly of Firmicutes, Bacteroidetes, Actinobacteria and
Verrucomicrobia. The composition varies along the gastrointestinal tract and from lumen to mucosal surface.
Longitudinal and axial differences in microbial presence contribute to the complexity of the gut microbiota.
This information underscores the gut microbiota's intricate diversity and functional importance in health and
disease.
Figure 2 (below) shows the differential pH and the types of bacteria that occur in different parts of the
intestine.
1.3 Factors that contribute to the composition of the typical gut microbiota
Birth Delivery Mode: The delivery method, whether vaginal or caesarean, can impact the initial gut
microbiota colonisation.
Infancy and Adult Diet: The diet followed during infancy, including breast milk or formula feeding, and
dietary choices in adulthood, such as vegan or meat-based diets, can influence the gut microbiota makeup.
(image courtesy: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528021/)
Figure 2: Concentration of the bacterial flora in regions of the gastrointestinal tract.
Antibiotic Usage and Environmental Compounds: The use of antibiotics or antibiotic-like compounds,
whether from the environment or from the gut's commensal community, is noteworthy because it raises
concerns about potential long-term disturbances to the healthy gut microbiota and the potential horizontal
transmission of resistance genes. This could create a reservoir of microorganisms housing a diverse pool of
multidrug-resistant genes.
It is crucial to acknowledge that antibiotics might disrupt the equilibrium of the gut microbiota, with
possible long-term consequences. Additionally, there is a risk of the horizontal transmission of antibiotic-
resistance genes among microorganisms, which could further contribute to developing multidrug-resistant
strains.
1.4 Current methods to study gut microbiota
Contemporary methods for investigating the gut microbiota involve advanced techniques that have revo-
lutionised our understanding of this complex microbial ecosystem. Traditional culture-based methods, which
faced limitations in isolating anaerobic microorganisms, have been supplemented and even replaced by more
comprehensive and efficient methodologies:
1.4.1 DNA Isolation
Stool samples are collected from individuals, and DNA is extracted from the stool. The DNA contains
genetic information of the microbial community residing in the gut. High-throughput sequencing of a specific
microbial gene, usually the 16S rRNA gene, is performed. This gene is highly conserved among bacteria but
contains variable regions that allow distinguishing between different species or genera. This method provides a
snapshot of the microbial diversity in the sample.
1.4.2 Bacterial Gene Sequencing
Sequencing of bacterial genes involves metagenomic analysis of the DNA that codes for the 16S rRNA. The
16S rRNA gene is small (about 1.5 kb) and highly conserved, with nine hypervariable regions
that are sufficient to differentiate various bacterial species.
Sequencing:
The extracted genetic material is subjected to high-throughput sequencing techniques like next-generation
sequencing (NGS). This results in a vast amount of sequencing data that needs analysis.
Data Quality Control and Preprocessing:
Raw sequencing data undergoes quality control steps to remove noise and errors. This includes trimming
low-quality reads and filtering out artefacts.
Taxonomic and Functional Profiling:
Various tools are used to analyse the sequencing data. Taxonomic profiling identifies the composition of
microbial species in the sample, often using tools like MetaPhlAn or MEGAN. Functional profiling assesses
the potential functional capabilities of the microbial community using tools like MG-RAST, KEGG, COG, or
TIGRFAM.
Host-Microbe Interaction Analysis:
Understanding the interaction between host and microorganisms is crucial. Tools like PICRUSt, FANTOM,
HUMAnN, and CAZy can help predict the functional roles of the microbiota and identify any influence on host
metabolism.
Statistical Analysis:
Statistical methods are applied to analyse the data, identify significant differences between groups, and
explore correlations between microbial species and host health conditions. R and Python libraries are often
used for this purpose (a minimal example is sketched after these steps).
Functional Annotation:
Functional annotation involves assigning specific functions to the identified genes or gene products. This
helps understand the potential roles of microbial communities in various biological processes.
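As a minimal sketch of the statistical step mentioned above (the group labels and relative-abundance values below are entirely hypothetical and used only for illustration), one might compare the abundance of a single genus between two groups of samples with a non-parametric test from SciPy:

from scipy.stats import mannwhitneyu

# Hypothetical relative abundances (fractions) of one genus in two sample groups.
healthy = [0.04, 0.06, 0.05, 0.07, 0.03, 0.05]
patients = [0.01, 0.02, 0.015, 0.03, 0.01, 0.02]

statistic, p_value = mannwhitneyu(healthy, patients, alternative="two-sided")
print(f"U statistic: {statistic}, p-value: {p_value:.4f}")
# A small p-value suggests the genus differs in abundance between the groups;
# real studies additionally correct for multiple testing across many taxa.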
1.5 Bioinformatics Analysis
Following sequencing, bioinformatics tools are employed to analyse the data. These tools help identify
the different microbial species or groups in the sample based on the sequences obtained. By comparing the
sequences to existing databases, researchers can gain insights into the composition and relative abundance of
the gut microbiota.
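As a simple illustration of how composition and relative abundance are summarised once reads have been assigned to taxa (the taxa, sample names, and counts below are invented for the example), a table of per-sample read counts can be converted into relative abundances with pandas:

import pandas as pd

# Hypothetical read counts assigned to a few phyla in three samples.
counts = pd.DataFrame(
    {
        "Sample1": [5200, 3100, 900, 400],
        "Sample2": [4100, 4500, 600, 300],
        "Sample3": [2500, 5200, 1500, 800],
    },
    index=["Firmicutes", "Bacteroidetes", "Actinobacteria", "Verrucomicrobia"],
)

# Divide each column (sample) by its total read count to obtain fractions.
relative_abundance = counts.div(counts.sum(axis=0), axis=1)
print(relative_abundance.round(3))  # each column now sums to 1.0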
Figure 3 outlines the various steps involved in the bioinformatics analysis, starting from the collection of
samples, through extraction and sequencing, to statistical analysis. The interaction between host and microbes, along
with the functional capacity of the microbiota, can be studied. MG-RAST: Metagenomics rapid annotation
using subsystem technology; CAZy: Carbohydrate-active enzymes; MetaPhlAn: Metagenomic phylogenetic
analysis; KEGG: Kyoto encyclopaedia of genes and genomes; COG: Clusters of orthologous groups;
PICRUSt: Phylogenetic investigation of communities by reconstruction of unobserved states; MEGAN: Metagenome
analyser; MEDUSA: Metagenomic data utilisation and analysis; FANTOM: Functional annotation
and taxonomic analysis of metagenomes; HUMAnN: Human microbiome project unified metabolic analysis
network; BLAST: Basic local alignment search tool; TIGRFAM: Protein sequence classification; PFAM: Protein
families; SOAP: Short oligonucleotide analysis package; QIIME: Quantitative insights into microbial ecology.
(image courtesy: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528021/)
Figure 3: Bioinformatics workflow.
1.6 Metabolomics
Metabolomics is an expanding field that examines the small molecules produced by both the host and the
gut microbiota due to their interactions. These molecules play a role in various metabolic pathways and can
affect health and disease. By studying the metabolome alongside the gut microbiota composition, researchers
can better understand how these microbial communities influence overall health.
1.7 Integrated Data Analysis
Combining data from microbial DNA sequencing and metabolomics provides a comprehensive overview of
the interplay between the gut microbiota and host metabolism. This integrated approach offers a more accurate
assessment of the relationship between the gut microbiota and various health and disease states.
These advanced methods have several advantages over traditional culture-based techniques. They allow for
studying a broader range of microorganisms, including the previously challenging-to-culture anaerobic species.
Moreover, these methods are less time-consuming and provide a more detailed and accurate representation of
the complex microbial communities in the gut. As technology continues to evolve, our ability to unravel the
intricate connections between the gut microbiota, host health, and disease will continue to improve.
The workflow encompasses the entire process of turning raw biological samples into valuable insights
about the microbial communities and their interactions with the host.
References
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4528021/
https://www.sciencedirect.com/science/article/abs/pii/S1877117322000904
https://www.sciencedirect.com/science/article/abs/pii/B978012804024900029X
https://www.ncbi.nlm.nih.gov/books/NBK7670/
Peterson DA, Frank DN, Pace NR, Gordon JI. Metagenomic approaches for defining the pathogenesis
of inflammatory bowel diseases. Cell Host Microbe. 2008;3:417–427. (https://pubmed.ncbi.nlm.nih.gov/18541218)
Hamady M, Walker JJ, Harris JK, Gold NJ, Knight R. Error-correcting barcoded primers for pyrosequencing
hundreds of samples in multiplexes. Nat Methods. 2008;5:235–237.
About the Author
Geetha Paul is one of the directors of airis4D. She leads the Biosciences Division. Her research
interests extend from Cell & Molecular Biology to Environmental Sciences, Odonatology, and Aquatic Biology.
Part IV
Computer Programming
Understanding Convolutional Neural
Networks
by Ninan Sajeeth Philip
airis4D, Vol.1, No.9, 2023
www.airis4d.com
Though it evolved alongside Artificial Neural Networks (ANNs), which use extracted features and the back-
propagation algorithm to train a model that maps the input feature space to the output target space,
the Convolutional Neural Network (CNN) received wider attention only later, when computational resources became
more easily accessible. CNNs are best suited to handling grid-like data, such as images. This is one
area in which ANNs struggled to perform even marginally well, owing to the difficulty of extracting reliable features
for classification.
Let us consider face recognition as an example. The distance between the eyes and the ratios of various line
segments joining different facial features can all be considered. But as soon as the camera's orientation changes,
all these values change or sometimes become inaccessible. What is required would be a comprehensive model
that considers the correlation between every region of the face and all possible orientations. This is essentially
what CNNs do.
CNNs were first introduced in the early 1980s, but they only became widely used in the 2010s when
advances in machine learning and computing power made it possible to train CNNs on large datasets of images.
CNNs have since been used to achieve state-of-the-art results on various image recognition tasks. In this article, we
will try to understand how CNNs generate features automatically from images or, for that matter, from any array of
numbers.
(image courtesy: https://edri.org/our-work/facial-recognition-and-fundamental-rights-101/)
Figure 1: Extraction of features from an image - traditional way
(image courtesy: DOI:10.3390/app9102028)
Figure 2: A simple CNN Architecture
Though the concept of Convolution was widely known, the first practical and popular application was
the LeNet-5 developed by Yann LeCun in 1998. Its success in the recognition of handwritten digits brought
it widespread recognition. However, due to the limited computational power available, ANNs remained more
popular in machine learning until around 2010, when GPUs and massively parallel computers revolutionised
the entire field. Several computational competitions encouraged researchers to enhance the features of existing
CNNs. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) held in 2012 witnessed the
groundbreaking success of AlexNet, developed by Alex Krizhevsky.
The CNN consists of several layers, each of which learns features from images hierarchically, making
recognising complex structures much more efficient than manually extracted features used as input to ANNs.
This hierarchical decomposition also allows CNNs to learn the spatial relationship between pixels in an image.
Though image recognition was the central application area handled by CNNs, their impressive
ability to decompose data hierarchically means that one-dimensional CNNs for sequential data are also common.
The CNN Architecture
Let us consider image recognition as an example to understand CNNs. The image can be a grayscale image
or a colour image. The greyscale image is a 2D array of numbers where each number indicates the pixel’s
brightness at that location. So, if the image has 32 pixels along the x-axis and another 32 along the y-axis, it
forms a 2D array of 32×32 pixels. For colour images, we have a separate array for each of the Red, Green
and Blue colours that form the image. Hence, the image combines three 2D arrays, one for each colour, and the
dimension is 32×32×3. This is the size of the input layer that accepts the input image.
The input layer for a 32×32 grey-scale image can be implemented in Python (Keras) with the command
Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 1)). Here, the input_shape (32, 32, 1) means that the
image has only one channel and is composed of 32 by 32 pixels. So, the only difference in the code for a
colour image is that instead of (32, 32, 1), the input_shape becomes (32, 32, 3).
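As a minimal sketch (assuming TensorFlow/Keras is installed, and using dummy zero-valued images in place of real data), the snippet below shows that the same kind of Conv2D layer accepts either a grayscale or a colour image; only the channel dimension of the input changes, and a 3×3 kernel with no padding shrinks the 32×32 input to 30×30.

import numpy as np
from tensorflow.keras.layers import Conv2D

gray_image = np.zeros((1, 32, 32, 1), dtype="float32")    # a batch of one 32x32 grayscale image
colour_image = np.zeros((1, 32, 32, 3), dtype="float32")  # a batch of one 32x32 RGB image

# Separate layer instances, since a layer's weights are built for a fixed
# number of input channels the first time it is called.
gray_features = Conv2D(32, (3, 3), activation='relu')(gray_image)
colour_features = Conv2D(32, (3, 3), activation='relu')(colour_image)

print(gray_features.shape, colour_features.shape)  # (1, 30, 30, 32) for both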
As the name suggests, Convolutional Neural Networks use convolution to extract features from an image.
The convolution is done using a smaller matrix called the kernel. A typical kernel has a dimension of
(3, 3), convolving the central pixel together with those around it. For example, assume that the kernel has the value 1
along the first column, 0 along the second and -1 along the third. An illustration of the convolution operation is
shown in Figure 3.
Though the illustration is with a simple kernel and a simple image, note that in the actual CNN, each layer
will consist of many kernels designed to extract all valuable features from an image. Since deciding the useful
kernels a priori is impossible, we leave that task for the machine to handle. This is what we mean by training.
It may be seen that the size of the feature map is different from that of the input image. This change in size
is determined by the kernel dimensions, padding, and stride, and is given by the formula
Output size = (Input size − Kernel size + 2 × Padding) / Stride + 1.
(a) The (3x3) kernel
(b) An input image of size (5,5) with some arbitrary
values
(c) The convolution operation generates the feature map
as output
(d) The operation is done by sliding the kernel over the
image by a predefined step size (here 1) and updating
the Feature Map.
Table 1.1: Source: https://medium.com/@rathna211994/convolution-neural-networks-cnn-b6fe90214b1e
Here, Padding and Stride are two new terms. The padding adds extra pixels (often zeros) around the input
data before applying convolution. It helps maintain the spatial dimensions of the feature map and can influence
whether the feature map is smaller or the same size as the input. The stride indicates how many pixels the kernel
moves in each step. A stride of 1 means the kernel shifts by 1 pixel, while a larger stride skips more pixels.
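To make the formula and the sliding-window operation concrete, here is a minimal NumPy sketch (the 5×5 pixel values are arbitrary and chosen only for illustration) that applies the 3×3 kernel described above with stride 1 and no padding, and checks the output size:

import numpy as np

# An arbitrary 5x5 image and the 3x3 kernel from the text
# (1s in the first column, 0s in the second, -1s in the third).
image = np.array([[3, 0, 1, 2, 7],
                  [1, 5, 8, 9, 3],
                  [2, 7, 2, 5, 1],
                  [0, 1, 3, 1, 7],
                  [4, 2, 1, 6, 2]], dtype=float)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

stride, padding = 1, 0
out_size = (image.shape[0] - kernel.shape[0] + 2 * padding) // stride + 1
print(out_size)  # (5 - 3 + 0)/1 + 1 = 3

feature_map = np.zeros((out_size, out_size))
# Slide the kernel over the image and take the element-wise product sum at each position.
for i in range(out_size):
    for j in range(out_size):
        window = image[i * stride:i * stride + 3, j * stride:j * stride + 3]
        feature_map[i, j] = np.sum(window * kernel)

print(feature_map)  # the 3x3 feature map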
The machine initialises each kernel with random values during the first training round. In subsequent
training rounds (epochs), the kernel values learned so far are carried forward. The image is then
presented through the input layer (Figure 2), and the feature map is generated. To introduce nonlinearity, an
activation function is applied to every element of the map. A typical activation function is the ReLU (Rectified Linear
Unit), which replaces every negative value with zero and leaves the positive values unchanged.
The generated map is then subjected to a Pooling Layer, a downsampling procedure that reduces the spatial
dimension of the data while retaining essential features. This is important to reduce the computational load
and memory usage and improve the network’s generalisation ability. There are mainly two types of Pooling:
Max Pooling and Average Pooling. We define a window size, much like that of the kernel, and then, in the case of max
pooling, replace the corresponding pixel in the resulting feature map with the maximum value in the window, or with
the average value in the case of Average Pooling. This can be implemented in Python as MaxPooling2D((n, n)),
where n represents the window size. Typically, n is set to 2.
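As a small illustrative sketch (with arbitrary values), 2×2 max pooling on a 4×4 feature map can be carried out directly in NumPy; average pooling simply replaces the maximum with the mean:

import numpy as np

feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 0],
                        [7, 2, 9, 4],
                        [1, 0, 3, 8]], dtype=float)

n = 2  # pooling window size (and step size)
# Group the 4x4 map into non-overlapping 2x2 blocks and take each block's maximum.
pooled = feature_map.reshape(2, n, 2, n).max(axis=(1, 3))
print(pooled)
# [[6. 5.]
#  [7. 9.]]
# Average pooling would use .mean(axis=(1, 3)) instead of .max(axis=(1, 3)).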
The output of the Pooling layer goes to the subsequent Convolution and pooling layers. The complexity of
the data determines the number of subsequent such layers. A typical value is two. Since the layers are arranged
sequentially, this model is often called Sequential.
The output of the final convolution layer is converted into a one-dimensional sequence of values by a
process called Flatten. This is to unwrap the 2D array by attaching each row to the end of the previous one. This
is required to build the final layers of the model using a Fully Connected Feed-Forward Network that can handle
only one-dimensional data. In structure, it is very similar to an ANN layer, optionally followed by a Dropout layer
that randomly switches off a fraction of the connections during training to reduce overfitting. The network's last layer
usually uses a softmax activation to convert the network's raw outputs into a probability for each possible class, from
which the most probable class of a given input image is chosen.
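To see what the softmax activation does, here is a small NumPy sketch with three invented output scores; softmax converts them into probabilities that sum to one, and the largest probability identifies the predicted class:

import numpy as np

scores = np.array([2.0, 1.0, 0.1])          # invented raw outputs for three classes
exp_scores = np.exp(scores - scores.max())  # subtracting the maximum improves numerical stability
probs = exp_scores / exp_scores.sum()

print(probs.round(3))         # [0.659 0.242 0.099]
print(probs.sum())            # approximately 1.0
print(int(np.argmax(probs)))  # 0, the index of the most probable class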
In Python, the entire CNN model can be represented conveniently using Keras as:
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])
We are now ready to get our hands dirty with a working CNN model on the popular CIFAR-10 dataset, which
consists of 60,000 32×32 colour images in 10 different classes. Each class contains 6,000 images. The dataset
is split into 50,000 training images and 10,000 testing images.
To run the code below, you need to install TensorFlow along with Python. You can use pip to install it with
the command "pip install tensorflow". Now copy and paste the following code into a file and save it as
simple_CNN.py. You can run it with the command "python simple_CNN.py". When you execute it, make sure
that you have a stable internet connection because the script downloads the CIFAR-10 dataset.
The preprocessing step in the code rescales the pixel values to fall between 0 and 1. Image pixels usually take
values between 0 and 255, which is why they are divided by 255. For images with a different
value range, appropriate modifications should be made. It also converts the target variables to categorical (one-hot)
form, as required by the categorical cross-entropy loss used below.
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical

# Load and preprocess the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0          # rescale pixels to [0, 1]
y_train, y_test = to_categorical(y_train, num_classes=10), to_categorical(y_test, num_classes=10)

# Create the CNN model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, batch_size=64, validation_data=(x_test, y_test))

# Evaluate the model on the test data
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc}")

# Make predictions on new data
predictions = model.predict(x_test[:10])
predicted_classes = [tf.argmax(prediction).numpy() for prediction in predictions]
print("Predicted classes:", predicted_classes)
About the Author
Professor Ninan Sajeeth Philip is a Visiting Professor at the Inter-University Centre for Astronomy
and Astrophysics (IUCAA), Pune. He is also an Adjunct Professor of AI in Applied Medical Sciences [BCMCH,
Thiruvalla] and a Senior Advisor for the Pune Knowledge Cluster (PKC). He is the Dean and Director of airis4D
and has 33+ years of teaching experience in Physics. His area of specialisation is AI and ML.
About airis4D
Artificial Intelligence Research and Intelligent Systems (airis4D) is an AI and Bio-sciences Research Centre.
The Centre aims to create new knowledge in the field of Space Science, Astronomy, Robotics, Agri Science,
Industry, and Biodiversity to bring Progress and Plenitude to the People and the Planet.
Vision
Humanity is in the 4th Industrial Revolution era, which operates on a cyber-physical production system. Cutting-
edge research and development in science and technology, creating new knowledge and skills, has become the key to
the new world economy. Most of the resources for this goal can be harnessed by integrating biological systems
with intelligent computing systems offered by AI. The future survival of humans, animals, and the ecosystem
depends on how efficiently the realities and resources are responsibly used for abundance and wellness. Artificial
intelligence Research and Intelligent Systems pursue this vision and look for the best actions that ensure an
abundant environment and ecosystem for the planet and the people.
Mission Statement
The 4D in airis4D represents the mission to Dream, Design, Develop, and Deploy Knowledge with the fire of
commitment and dedication towards humanity and the ecosystem.
Dream
To promote the unlimited human potential to dream the impossible.
Design
To nurture the human capacity to articulate a dream and logically realise it.
Develop
To assist talents in materialising a design into a product, a service, or knowledge that benefits the community
and the planet.
Deploy
To realise and educate humanity that a knowledge that is not deployed makes no difference by its absence.
Campus
Situated on a lush green village campus in Thelliyoor, Kerala, India, airis4D was established under the auspices
of SEED Foundation (Susthiratha, Environment, Education Development Foundation), a not-for-profit company
for promoting Education, Research, Engineering, Biology, Development, etc.
The whole campus is powered by solar energy and has a rainwater harvesting facility that provides a sufficient water
supply for up to three months of drought. The computing facility on the campus is accessible from anywhere, 24×7,
through dedicated optical fibre internet connectivity.
There is a freshwater stream that originates from the nearby hills and flows through the middle of the campus.
The campus is a noted habitat for the biodiversity of tropical fauna and flora. airis4D carries out periodic and
systematic water quality and species diversity surveys in the region to ensure its richness. It is our pride that
the site has consistently been environment-friendly and rich in biodiversity. airis4D also grows fruit plants
that can feed birds and maintains water bodies to help them survive the drought.