But according to scientists:

Some birds navigate by the Sun, using its position to find the right direction during the day; at night they navigate by the positions of the stars and the Moon.

Some birds use the Earth's magnetic field.

Some scientists hypothesize that birds have deposits of magnetite at the base of their bills, which they use like tiny magnets that pull them north; in effect, an internal global positioning system.

Other scientists believe photons of light entering a bird's eye excite entangled electrons into chemical reactions that, multiplied across the eye, could create a map-like image of the magnetic field.

Some birds use infrasound, the low-frequency noise created by ocean waves.

Some birds make their migrations in family groups, led always by an older bird that has made the flight before.

Some birds, like seabirds, can smell their way back. Some birds use knowledge of the landscape: they follow coastlines, mountains, and river valleys.

Finally, however they navigate, birds are far better at navigation than humans.

To know more read:

https://en.wikipedia.org/wiki/Animal_navigation

https://www.allaboutbirds.org/news/the-basics-migration-navigation/

Hope you learned something new here, and don't forget to comment your thoughts below.

Thanks for reading!

Keep learning, Keep Growing.

Have a look at my previous blogs: Learnings from: IKIGAI - The Japanese Secret to a Long Happy Life

Let's see where all this began.

Well, to start with, we have to know a bit of history from the Babylonian period. The Babylonians derived their number system from the Sumerians, who were using it as early as 3500 BC. They divided the 24-hour day into two parts:

- a day lasting 12 hours and a night lasting 12 hours,
- 1 hour is 60 minutes and 1 minute is 60 seconds.

But why??

Because they used the duodecimal (base 12) and sexagesimal (base 60) systems, which led to 12 hours of day and 12 hours of night.

Hipparchus gave us the “Equinoctial hours” by proposing the division of a day into 24 equal hours.

**'12' is a very special number**

The importance of the number 12 is typically attributed either to the number of lunar cycles in a year or to the number of finger joints on each hand (three in each of the four fingers, excluding the thumb), which makes it possible to count to 12 using the thumb. So it can be said that the structure of our fingers may be the reason!

Another reason for using '12' is that 12 can be written as 6 × 2 or 3 × 4, so a day can easily be divided into halves and quarters, whereas 10 has only four divisors (whole numbers that divide it a whole number of times): 1, 2, 5, and 10.

Sixty has 12 divisors, and because 60 = 5 × 12 it combines the advantages of both 10 and 12. It is notably convenient for expressing fractions, since 60 is the smallest number divisible by each of the first six counting numbers (1, 2, 3, 4, 5, 6) as well as by 10, 12, 15, 20, and 30.
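These divisor counts are easy to verify with a quick illustrative check:

```
# Count the divisors of a number by trial division.
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

print(divisors(10))        # [1, 2, 5, 10]
print(divisors(12))        # [1, 2, 3, 4, 6, 12]
print(len(divisors(60)))   # 12
```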

Nobody really cared about seconds until after the Middle Ages, and the division of time existed mainly to communicate time to others.

If someone asks you what time it is, you can say “half a day completed” or “quarter day completed”.

But imagine if a day were 10 hours: a third of the day would be hard to state, since 10/3 is 3.3333…, which is obviously awkward to say. Sixty, by contrast, divides evenly by 2, 3, 4, 5, and 6, so it is easy to communicate the time.

Another reason might be with the calculation of degrees.

Longitude is measured by imaginary lines that run around the Earth vertically (up and down) and meet at the North and South Poles. These lines are known as meridians. Each meridian measures one arcdegree of longitude. The distance around the Earth measures 360 degrees. Each degree was divided into 60 parts, each of which was again subdivided into 60 smaller parts.

The first division, partes minutae primae, or first minute, became known simply as the "minute."

The second segmentation, partes minutae secundae, or "second minute," became known as the second.

The division of time can also be connected to the rotation of the Earth, which is a full 360 degrees. The Earth rotates 1 degree every 4 minutes, so in one hour (60 minutes) it turns 15 degrees, and covering all 360 degrees takes 24 hours (24 × 15 = 360).

Isn't all this fun to know?

Hope you learned something new here, and don't forget to comment your thoughts below. Thanks for reading!
Keep learning, Keep Growing.

We will look at metrics that help in understanding similarity through measures of correlation.

- Pearson's Correlation Coefficient
- Spearman's Correlation Coefficient

Let's take a look at each of these individually.

**Pearson's correlation coefficient** is a measure of the strength and direction of a **linear** relationship. The value of this coefficient lies between -1 and 1, where -1 indicates a strong negative linear relationship and 1 indicates a strong positive linear relationship.

If we have two vectors x and y, we can compare them in the following way to calculate Pearson's correlation coefficient:

$$CORR(x, y) = \frac{\sum\limits_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum\limits_{i=1}^{n}(x_i - \bar{x})^2}\sqrt{\sum\limits_{i=1}^{n}(y_i - \bar{y})^2}}$$

where

$$\bar{x} = \frac{1}{n}\sum\limits_{i=1}^{n}x_i$$

or it can also be written as

$$CORR(x, y) = \frac{\text{COV}(x, y)}{\text{STDEV}(x)\text{ }\text{STDEV}(y)}$$

where

$$\text{STDEV}(x) = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

and

$$\text{COV}(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$

where n is the length of the vectors, which must be the same for both x and y, and $$\bar{x}$$ is the mean of the observations.

A function to compute the Pearson correlation coefficient:

```
import numpy as np

def pearson_corr(x, y):
    '''
    Parameters
    x - an array of matching length to array y
    y - an array of matching length to array x
    Return
    corr - the pearson correlation coefficient for comparing x and y
    '''
    # Compute the mean of each vector
    mean_x, mean_y = np.sum(x) / len(x), np.sum(y) / len(y)
    # Deviations from the mean
    x_diffs = x - mean_x
    y_diffs = y - mean_y
    # Covariance term divided by the product of the standard deviation terms
    corr_numerator = np.sum(x_diffs * y_diffs)
    corr_denominator = np.sqrt(np.sum(x_diffs**2)) * np.sqrt(np.sum(y_diffs**2))
    corr = corr_numerator / corr_denominator
    return corr
```
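As a quick sanity check with made-up numbers, the manual computation agrees with NumPy's built-in `np.corrcoef`:

```
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Manual computation, mirroring the formula above
x_diffs, y_diffs = x - x.mean(), y - y.mean()
manual = np.sum(x_diffs * y_diffs) / (
    np.sqrt(np.sum(x_diffs**2)) * np.sqrt(np.sum(y_diffs**2))
)

# NumPy's built-in Pearson correlation
builtin = np.corrcoef(x, y)[0, 1]
print(abs(manual - builtin) < 1e-12)  # True
```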

Spearman's correlation is a non-parametric measure of rank correlation (statistical dependence between the rankings of two variables). It assesses how well the relationship between two variables can be described using a monotonic function.

The Spearman correlation between two variables is equal to the Pearson correlation between the rank values of those two variables; while Pearson's correlation assesses linear relationships, Spearman's correlation assesses monotonic relationships (whether linear or not).

You can quickly change from the raw data to the ranks using the **.rank()** method, as shown in the code cell below.

We first map each of our vectors to ranked values:

$$\textbf{x} \rightarrow \textbf{x}^{r}$$ $$\textbf{y} \rightarrow \textbf{y}^{r}$$

Here **r** indicates these are ranked values (this is not raising any value to the power of r). Then we compute Spearman's correlation coefficient as Pearson's correlation on the ranks:

$$SCORR(x, y) = \frac{\sum\limits_{i=1}^{n}(x^r_i - \bar{x}^r)(y^r_i - \bar{y}^r)}{\sqrt{\sum\limits_{i=1}^{n}(x^r_i - \bar{x}^r)^2}\sqrt{\sum\limits_{i=1}^{n}(y^r_i - \bar{y}^r)^2}}$$

where

$$\bar{x}^r = \frac{1}{n}\sum\limits_{i=1}^{n}x^r_i$$

A function that takes in two vectors and returns the Spearman correlation coefficient:

```
import numpy as np

def spearman_corr(x, y):
    '''
    Parameters
    x - a pandas Series of matching length to Series y
    y - a pandas Series of matching length to Series x
    Return
    corr - the spearman correlation coefficient for comparing x and y
    '''
    # Change each vector to ranked values
    x = x.rank()
    y = y.rank()
    # Compute mean values of the ranks
    mean_x, mean_y = np.sum(x) / len(x), np.sum(y) / len(y)
    # Pearson's correlation computed on the ranked values
    x_diffs = x - mean_x
    y_diffs = y - mean_y
    corr_numerator = np.sum(x_diffs * y_diffs)
    corr_denominator = np.sqrt(np.sum(x_diffs**2)) * np.sqrt(np.sum(y_diffs**2))
    corr = corr_numerator / corr_denominator
    return corr
```
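To see why Spearman's coefficient captures monotonic rather than strictly linear relationships, here is a small made-up check: a perfectly monotonic but non-linear relationship still gets a Spearman correlation of exactly 1.0.

```
import numpy as np
import pandas as pd

x = pd.Series([1, 2, 3, 4, 5])
y = pd.Series([1, 4, 9, 16, 25])  # y = x**2: monotonic, not linear

# Pearson's correlation computed on the ranks = Spearman's correlation
xr, yr = x.rank(), y.rank()
xd, yd = xr - xr.mean(), yr - yr.mean()
spearman = np.sum(xd * yd) / (np.sqrt(np.sum(xd**2)) * np.sqrt(np.sum(yd**2)))

print(spearman)  # 1.0
```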

- Euclidean Distance
- Manhattan Distance

Euclidean distance is a measure of the straight-line distance from one vector to another. In other words, it is the square root of the sum of squared differences between corresponding elements of the two vectors. Since this is a measure of distance, larger values indicate the two vectors are more different from one another. Euclidean distance is the basis of many measures of similarity and dissimilarity.

Euclidean distance is only appropriate for data measured on the same scale.
Consider two vectors **x** and **y**; we can compute the Euclidean distance as:

$$ EUC(\textbf{x}, \textbf{y}) = \sqrt{\sum\limits_{i=1}^{n}(x_i - y_i)^2}$$

A function to compute the Euclidean distance (with help from NumPy):

```
import numpy as np

def eucl_dist(x, y):
    '''
    Parameters
    x - an array of matching length to array y
    y - an array of matching length to array x
    Return
    euc - the euclidean distance between x and y
    '''
    # The L2 norm of the difference vector is the euclidean distance
    return np.linalg.norm(x - y)
```

Different from Euclidean distance, Manhattan distance is a 'city block' distance from one vector to another. You can imagine this distance as the way to compute the distance between two points when you are not able to cut through buildings or blocks.

Specifically, this distance is computed as:

$$ MANHATTAN(\textbf{x}, \textbf{y}) = \sum\limits_{i=1}^{n}|x_i - y_i|$$

A function to calculate the Manhattan distance:

```
def manhat_dist(x, y):
    '''
    INPUT
    x - an array of matching length to array y
    y - an array of matching length to array x
    OUTPUT
    manhat - the manhattan distance between x and y
    '''
    # Sum of absolute element-wise differences
    return sum(abs(e - s) for s, e in zip(x, y))
```

In the above image, the **blue** line gives the **Manhattan** distance, while the **green** line gives the **Euclidean** distance between two points.

Here, when finding similarity by a measure of distance, no scaling is performed in the denominator. Therefore, you need to make sure all of your data are on the **same scale** when using these metrics.

Because measuring similarity is often based on looking at the distance between vectors, it is important in these cases to scale your data or to have all data be in the same scale.

It becomes a problem if some measures are on a 5-point scale while others are on a 100-point scale: we are likely to get non-optimal results due to the difference in variability across features.
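A small sketch with hypothetical numbers shows the effect: one feature is a rating on a 1 to 5 scale, the other a score on a 0 to 100 scale, and min-max scaling is one way to put them on the same footing.

```
import numpy as np

# Hypothetical points: [rating on 1-5 scale, score on 0-100 scale]
a = np.array([1.0, 20.0])
b = np.array([2.0, 90.0])

# Unscaled: the 0-100 feature dominates the distance almost entirely.
raw = np.linalg.norm(a - b)

# Min-max scale each feature to [0, 1] first, then measure distance.
lo, hi = np.array([1.0, 0.0]), np.array([5.0, 100.0])
scaled = np.linalg.norm((a - lo) / (hi - lo) - (b - lo) / (hi - lo))

print(raw)     # ~70.0: nearly all from the score difference
print(scaled)  # ~0.74: both features now contribute comparably
```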

Let's get started.

The **confusion matrix** is a table used to describe the performance of a classification model.

We can use accuracy as a metric to analyze the performance of a model, so why the confusion matrix? What is the need for it?

So to understand this let us consider an **example** of the cancer prediction model.

Since this is a binary classification model, its job is to detect cancerous patients based on some features. Considering that only a few people out of millions get cancer, assume that only 1% of the data provided is cancer-positive. **Having cancer is labeled as 1, and not having cancer as 0.**

An interesting thing to note here is that if the system predicts all 0's, the prediction accuracy will still be 99%. It is equivalent to replacing the model's output with print(0); this too would have an accuracy of 99%.
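The accuracy trap can be demonstrated in a few lines (illustrative numbers: 1% of 1000 patients are positive, and the "model" always outputs 0):

```
# The "print(0) model": always predict the negative class.
y_true = [1] * 10 + [0] * 990   # 1% cancer-positive out of 1000
y_pred = [0] * 1000

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)  # 0.99
```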

But this is not right, is it?

Now that you know what is the problem and the need for a new metric to help in this situation, let us see how the confusion matrix solves this problem.

Let us consider an example with a classification dataset having 1000 data points.

We get the below confusion matrix:

| | Actual y=1 | Actual y=0 |
|---|---|---|
| Predicted y=1 | TP = 30 | FP = 30 |
| Predicted y=0 | FN = 10 | TN = 930 |

There will be two classes 1 and 0.

1 would mean the person has cancer, and 0 would mean they don't have cancer.

Looking at this table, we have 4 different combinations of predicted and actual values. The predicted value gives us Positive or Negative, and whether it matches the actual value makes the prediction True or False.

Just hold on, this is easy and you will understand.

**True Positive**:

Interpretation: Model predicted positive and it’s true.

Example understanding: The model predicted that a person has cancer and a person actually has it.

**True Negative**:

Interpretation: Model predicted negative and it’s true.

Example understanding: The model predicted that a person does not have cancer and he actually doesn't have cancer.

**False Positive**:

Interpretation: Model predicted positive and it’s false.

Example understanding: The model predicted that a person has cancer but he actually doesn't have cancer.

**False Negative**:

Interpretation: Model predicted negative and it’s false.

Example understanding: The model predicted that a person does not have cancer, but the person actually has cancer.

**Precision**:

Out of all the cases the model predicted as positive, how many are actually positive? Precision = TP / (TP + FP).

**Recall**:

Out of all the actual positive cases, how many did we predict correctly? Recall = TP / (TP + FN).

Image credit: Wikipedia

Calculating precision and recall for the above table.

Let's compare this with accuracy.

The model got an accuracy of 96%, but a precision of 0.5 and a recall of 0.75, which means that 50% of the cases predicted as cancer turned out to be actual cancer cases, whereas 75% of the actual cancer positives were successfully identified by our model.

Consider an example where the prediction is replaced by print(0), so that we get 0 every time.

| | Actual y=0 | Actual y=1 |
|---|---|---|
| Predicted y=0 | 914 | 86 |
| Predicted y=1 | 0 | 0 |

Here the accuracy will be 91.4%, but what happens to precision and recall?

Precision becomes 0 since TP is 0.

Recall becomes 0 since TP is 0.

This is a classic example to understand Precision and Recall.

So now you understand why accuracy is not so useful for imbalanced datasets and how precision and recall play a key role.

One important thing to understand is when to use precision and when to use recall.

Precision is a useful metric when False Positives are of greater concern than False Negatives.

For example, in recommendation systems like YouTube and Google this is an important metric, since wrong recommendations may cause users to leave the platform.

Recall is a useful metric when False Negatives are of greater concern than False Positives. For example, in the medical field, flagging a healthy patient as positive can be tolerated to an extent, but a patient who actually has the disease should always be detected.

So what is the case when we are not sure whether to use Precision or Recall??

Or what do we do when only one of the two is high: is the model good?

To answer this let us see what is F1 score.

**F1-score** is the harmonic mean of Precision and Recall, and it is high only when Precision and Recall are both high.

**Why not normal arithmetic mean instead of harmonic mean??**

Because the arithmetic mean can be high when just one of the two is high, whereas the harmonic mean is high only when both values are high and close to each other.
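A hypothetical lopsided model makes the difference concrete: near-perfect precision with very poor recall looks fine under the arithmetic mean but not under the harmonic mean.

```
# Hypothetical lopsided model: near-perfect precision, very poor recall.
precision, recall = 1.0, 0.02

arithmetic = (precision + recall) / 2
harmonic = 2 * precision * recall / (precision + recall)

print(arithmetic)  # 0.51: looks deceptively decent
print(harmonic)    # ~0.039: exposes the weak recall
```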

So from our example, the F1 score becomes

F1 = 2TP / (2TP + FP + FN) = 2*30 / (2*30 + 30 + 10) = 0.6
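The worked example can be verified end to end, using the counts TP = 30, FP = 30, FN = 10, and TN = 930 (so the four cells sum to the 1000 data points):

```
# Counts from the worked example above.
TP, FP, FN, TN = 30, 30, 10, 930

accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * TP / (2 * TP + FP + FN)

print(accuracy, precision, recall, f1)  # 0.96 0.5 0.75 0.6
```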

I believe that after reading this, the confusion matrix is not so confusing anymore!

Hope you learned something new here, and don't forget to comment your thoughts below. Thanks for reading!

Keep learning, Keep Growing.

We respond in a particular way to a particular stimulus.

But in between stimulus and response, We have the freedom to choose how we respond.

And based on our freedom of choice to respond, we either become reactive or proactive.

Reactive people are affected by their physical, social, or psychological environment; their response to a stimulus is driven by their surroundings and keeps changing as the environment changes.

Proactive people are also influenced by external stimuli, but their response to stimuli is a value-based choice.

How do we shift our attitude from Reactive to Proactive???

For this paradigm shift to happen from Reactive to Proactive we should start by taking Initiative.

We should make a conscious effort to change. Many people wait for something to happen or for someone to help them, but in reality we should help ourselves, and instead of being part of the problem we should start finding solutions to problems.

And that comes with taking responsibility to act. Just planning is not enough; taking that first step is what matters most.

When facing any critical situation, we can ask these three questions to find a solution to the problem:

- What's happening to me? What is the stimulus?
- What's going to happen in the future based on the decision I make?
- What is my response? What can I do? What initiative can I take in this situation?

Our initiative to respond to stimulus determines our degree of Proactivity.

As many people say, `be a positive person` and `have positive thinking in life`. But that's just not enough.

Thinking positively without understanding reality may cause danger. And that's exactly the difference between Positive thinking and Proactivity.

The stimulus may be good or bad, but our response to it is our own choice, which may make the situation better or worse, and that mostly depends on the choice we make.

Just thinking Positive is not the solution, We should face reality. We should consider the current and future impact and still find the power to choose a positive response to that stimulus.

Using the right language also has an impact on our proactive behavior. For example, instead of "I have to", start using "I choose to".

To become proactive, we should start focusing more of our time and energy on the Circle of Influence: the things within our Circle of Concern that we can actually do something about. This, too, determines our degree of proactivity.

Try to work on the Circle of Influence by making small commitments and following through on them. Don't be in a blaming, accusing mode; rather, work on the things you have control over. Work on you.

Concentrate on the freedom of choice you have in responding to stimuli.

You can read my other Blogs:

- Why do I write Blogs
- Series of blogs based on my Learnings from IKIGAI : the Japanese secret to a long happy life
- Invention vs Innovation
- www of new years resolution

Hope you learned something new here, and don't forget to comment your thoughts below.
Thanks for reading!

Keep learning, Keep Growing.

I became interested in and curious about applications of machine learning during my engineering studies and wanted to learn it. I got started with courses from Coursera. Most courses concentrated on theory; though that is necessary, I preferred to learn by building applications. I was looking for a course that teaches by building what we learn, and luckily, at that time, I found out about the fastai course from my friend.

Yes, it is a course, and it teaches the fastai library, which is an open-source project. The course is also free.

The course follows a top-down approach and Jeremy makes it very interesting. 80% of what I learned on Machine learning and deep learning comes from fastai.

If you haven't heard about fastai, please do visit fast.ai.

My blogging journey also started because of fastai. I was inspired by Rachel Thomas's blog, Why you (yes, you) should blog.

Here you can read Why do I write blogs?

Thanks a lot to Jeremy Howard and Rachel Thomas for inspiring and helping me to start my journey in deep learning. And thanks a lot to everyone who contributed to this amazing open-source project, fastai.

The main reason for me to blog is to write about the things I learned and the difficulties I faced while learning something new, so that a beginner starting on the same journey might quickly get past the beginner stuff and move to the intermediate level.

We usually learn complicated concepts with hard work and dedication, but after we work with them for a few days, those concepts become easy for us. At that point, if a beginner asks us about such a concept, we explain it with a lot of jargon and might not be able to explain it in a beginner-friendly way.

So I feel the solution for this is to write a blog about what I learn, so that it might help beginners and serve as a reference for me in the future. Basically, beginner-friendly documentation.

I believe I learn more when I am able to teach a concept to someone in simple words. Teaching to others also improves communication skills.

If I am able to explain a concept to a person without much jargon and in a simple way, it means I have learned that concept correctly. So the second reason is to teach what I have learned, through which I learn in more depth. One of the best ways to test whether I understood something is to explain it to someone else.

I want to develop the habit of learning a new concept in Machine learning every week. And when I learn something I will write a blog on it.

So indirectly, excitement and maintaining consistency of publishing blog every week will keep me motivated to learn a new concept in ML every week.

Making a habit of writing a blog with consistency is difficult. But once this becomes a habit it is much better than striving for motivation. Building a system of habit is much better than motivation.

I want to expand my network, connect with like-minded people, and learn from them. As I expand my network, I will get more feedback on the blogs I write.

Feedback plays a vital role for a writer and helps them improve. Feedback and suggestions always improve writing, and may sometimes be an eye-opener.

Positive feedback motivates me to write more, whereas negative feedback helps me grow and get better.

Blogging has increased my confidence. I am now much more confident posting a blog than I was with my first one. I still remember not publishing or sharing my first blog on social media for 7 days after writing it, repeatedly checking the entire content to see if everything was correct. Writing blogs has increased my confidence, both in the depth of my understanding of a concept and in expressing my voice on social media.

Coming to monetization, **I don't want to monetize my blog. Actually, I don't enjoy reading blogs with ads all around them, so I don't want to give that trouble to my readers.**
Maybe in the future, if there is a way to monetize the blog without disturbing readers, I may surely try that option.

I want to cover as many topics as I can in data science that are necessary for beginners starting their journey. I also want to write about Python and how to use it effectively for competitive programming.

P.S.: I have learned a lot about deep learning from fastai, and was inspired by Rachel Thomas's blog, Why you should blog?

If you have not read it, please do read it.

Thanks for your time and reading!

See you until the next article!

The Couchbase Analytics service comes with the Enterprise Edition.

Couchbase Analytics is a parallel data management capability for Couchbase Server which is designed to efficiently run large ad hoc join, set, aggregation, and grouping operations over many records.

**Why analytics?**

Every business does these three things in a cycle or a spiral [The Goal].

- Run the business process to deliver products or services to the customers.
- Analyze the business to determine what to change and what to change to.
- Make the change happen.

The Query Service is used by the applications needed to run the business; it is designed for a large number of concurrent queries, each doing a small amount of work. In the RDBMS world, this is called the OLTP workload.

Applications or tools used for analysis have different workload characteristics. These typically use the Analytics Service, which is designed for a smaller number of concurrent queries, each analyzing a larger number of documents. In the RDBMS world, this is called the OLAP (online analytical processing) workload.

Advantages of the Couchbase Analytics approach / Why Couchbase Analytics?

- **Common data model**: we don't have to force our data into a flat, predefined, relational model to analyze it.
- **Workload isolation**: operational query latency and throughput are protected from slow-downs due to your analytical query workload.
- **High data freshness**: Analytics runs on data that's extremely current, without ETL or delays.
- **Continuous data sync**
- **Easy-to-manage SQL++ interface**

Reduce infrastructure complexity, application development complexity, and cost with a single system for operations and analytics.

Due to the large scale and duration of the operations it is likely to perform, the Analytics Service should run alone on its own cluster node, with no other Couchbase service running on that node.

The minimum memory (RAM) quota required for the Analytics Service is 1024 MB.

Analytics queries never touch the Couchbase data nodes; instead, they run in parallel on real-time shadow copies of the data. Because of this, there is no worry about slowing down the Couchbase Server nodes with complex queries.

The top-level organizing concept in the Analytics data world is the dataverse.

A **dataverse** is a namespace that gives you a place to create and manage datasets and other artifacts for a given Analytics application.

In that respect, a dataverse is similar to a database or a schema(schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated) in a relational DBMS.

Datasets are containers that hold collections of JSON objects. They are similar to tables in an RDBMS or keyspaces in N1QL. A dataset is linked to a Couchbase bucket so that the dataset can ingest data from Couchbase Server.

A fresh Analytics instance starts out with two dataverses: one called Metadata (the system catalog) and one called Default (available for holding data).

The first task is to tell Analytics about the Couchbase Server data that you want to shadow and the datasets where you want it to live.

```
CREATE DATASET datasetName ON `bucketName`;
```

If the bucket holds data of different types or categories, we can create a separate dataset for each using a `WHERE` clause.
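For example, a sketch assuming the bucket's documents carry a hypothetical `type` field (the dataset name and filter value here are placeholders):

```
CREATE DATASET beers ON `bucketName` WHERE `type` = "beer";
```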

```
CONNECT LINK Local;
```

Here, Local means the local server (the Data Service in the same cluster).

Now perform a basic query from the Analytics Service to check that everything is working fine.

```
SELECT meta(k) AS meta, k AS data
FROM datasetName k
LIMIT 1;
```

As in SQL, the query's FROM clause binds the variable `k` incrementally to the data instances residing in the dataset named `datasetName`. The SELECT clause returns all of the meta-information plus the data value for each binding that satisfies the predicate.

Once this is set up, you can access this service from the Couchbase Java SDK too.

Follow Establishing connection to Couchbase server using Couchbase Java SDK to set up Java SDK.

After following that blog, edit the App.java code as follows to use Couchbase Analytics.

```
import com.couchbase.client.java.*;
import com.couchbase.client.java.kv.*;
import com.couchbase.client.java.json.*;
import com.couchbase.client.java.query.*;
import com.couchbase.client.core.error.CouchbaseException;
import com.couchbase.client.java.analytics.*;

public class App {
    public static void main(String[] args) {
        try {
            Cluster cluster = Cluster.connect("localhost", "username", "password");
            final AnalyticsResult result = cluster
                    .analyticsQuery("SELECT * FROM datasetName LIMIT 2;");
            for (JsonObject row : result.rowsAsObject()) {
                System.out.println("Found row: " + row);
                System.out.println();
            }
        } catch (CouchbaseException ex) {
            ex.printStackTrace();
        }
    }
}
```

Credits: Couchbase Documentation

This blog is about how I created a Maven project on an Ubuntu EC2 instance from the terminal and established a connection to Couchbase Server to fetch data from a bucket.

Maven is a Java tool, so you must have Java installed in order to proceed.

First, download Maven and follow the installation instructions. I am using Ubuntu on EC2, so your steps may differ if you're using Windows or Mac. Verify the installation with:

```
mvn --version
```

Once you've confirmed Maven is installed, move on to creating the Maven project.

Create a folder, start a shell there, and execute the following Maven command.

```
mvn archetype:generate -DgroupId=com.couchbase.client -DartifactId=couchbaseanalytics -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
```

Here, `artifactId` (passed as `-DartifactId`) is your project name and `archetypeArtifactId` is the type of Maven project.

Once you press Enter after typing the above command, it will start creating the Maven project. If you have just installed Maven, it may take a while on the first run. This is because Maven is downloading the most recent artifacts (plugin jars and other files) into your repository. You may also need to execute the command a couple of times before it succeeds. This is because the remote server may time out before your downloads are complete.

**ERROR**: In case of build failure, please check the Maven version number in the pom.xml file. It should match the version of Maven installed on your machine.

Once the command has executed, it will have created a directory with the same name as the artifactId. Change into that directory.

`cd couchbaseanalytics`

Under this directory you will notice the following standard project structure. If it doesn't look like this, you might have deviated somewhere.

```
couchbaseanalytics
|-- pom.xml
`-- src
|-- main
| `-- java
| `-- com
| `-- couchbase
| `-- client
| `-- App.java
`-- test
`-- java
`-- com
`-- couchbase
`-- client
`-- AppTest.java
```

The src/main/java directory should contain the project source code, and the pom.xml file is the project's Project Object Model, or POM.

The pom.xml file is the core of a project's configuration in Maven. It is a single configuration file that contains the majority of information required to build a project in just the way you want.

Open the pom file and try to explore the different fields.

Now run below command to build the project.

```
mvn package
```

Once you get BUILD SUCCESS, you may test the newly compiled and packaged JAR with the following command:

```
java -cp target/couchbaseanalytics-1.0-SNAPSHOT.jar com.couchbase.client.App
```

This command will print :

`Hello World!`

Now add the following to the pom.xml file inside `<dependencies>`:

```
<dependency>
<groupId>com.couchbase.client</groupId>
<artifactId>java-client</artifactId>
<version>3.1.3</version>
</dependency>
```

Once that is done, edit the App.java file.

You can open it using vi; I used the following command instead of cd-ing into the directory:

```
vi /home/ubuntu/kiran/couchbaseanalytics/src/main/java/com/couchbase/client/App.java
```

Note: your path may be different so change it respectively.

Add the below code inside the main class:

`Cluster cluster = Cluster.connect(connectionString, username, password);`

Here, connectionString is "localhost".

The Cluster provides access to cluster-level operations like N1QL queries, analytics, or full-text search.

The following imports are necessary to build the snippets:

```
import com.couchbase.client.java.*;
import com.couchbase.client.java.kv.*;
import com.couchbase.client.java.json.*;
import com.couchbase.client.java.query.*;
```

To access the KV (Key/Value) API or to query views, you need to open a Bucket:

```
// get a bucket reference
Bucket bucket = cluster.bucket(bucketName);
```

Perform a N1QL query at the cluster level:

```
QueryResult result = cluster.query("select * from `beer-sample` limit 10");
System.out.println(result.rowsAsObject());
```

Note: I had the bucket `beer-sample`; you can replace that with the bucket name you have.

Now run:

`mvn clean install`

```
mvn compile exec:java -Dexec.mainClass="com.couchbase.client.App" -Dexec.arguments="Hello World,Bye"
```

You should see records from the bucket in the terminal.

```
import com.couchbase.client.java.*;
import com.couchbase.client.java.kv.*;
import com.couchbase.client.java.json.*;
import com.couchbase.client.java.query.*;

public class App {
    public static void main(String[] args) {
        // Replace the connection string, credentials, and bucket name with your own
        Cluster cluster = Cluster.connect("localhost", "username", "password");
        Bucket bucket = cluster.bucket("beer-sample");
        QueryResult result = cluster.query("select * from `beer-sample` limit 10");
        System.out.println(result.rowsAsObject());
    }
}
```

Hooray! That's it. You have now established a connection with the Couchbase server. You can edit the code in App.java to adapt it to your use case.

]]>I am writing this blog to report my findings for the Starbucks Capstone Project of the Udacity Data Science Nanodegree.

Starbucks has provided data on its promotional offers to be analyzed; the data set contains simulated data that mimics customer behavior on the Starbucks rewards mobile app.

You can find the GitHub repository of the project here, along with a detailed explanation of the code in the project's Jupyter Notebook.

In this analysis, I want to try to reach a solution with the data I have: Data -> Question -> Solution.

Can we classify whether an offer is going to be successful based on demographic and offer information?

- My approach is to build a machine learning model that predicts offer success from the demographic information and offer details provided in the data.

Which offer is the most successful?

Who spends more money, male or female?

For the last two questions, I am planning to plot graphs with the required entities to gain insight.

The dataset is contained in three files:

- portfolio.json - containing offer ids and meta data about each offer (duration, type, etc.)
- profile.json - demographic data for each customer
- transcript.json - records for transactions, offers received, offers viewed, and offers completed.

I am following the steps below for data cleaning in the profile, portfolio, and transcript datasets.

- Check how many data entries are missing values and look for any correlation among the missing values.
- It is interesting to see in the profile data that for every gender with value None, the income is also unknown, and when both are unknown, age is set to 118. These rows account for roughly 12.8% of the data; since this is a small percentage, my strategy is to drop them for the machine learning model rather than implement an imputation strategy.
- Create dummies for categorical columns.
- Convert the required columns to datetime format.
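A minimal pandas sketch of these cleaning steps. The column names (`gender`, `income`, `became_member_on`) and the toy values are assumptions for illustration, not the real data:

```python
import numpy as np
import pandas as pd

# Toy stand-in for profile.json; column names are assumptions
profile = pd.DataFrame({
    "gender": ["F", "M", None, "F"],
    "age": [35, 52, 118, 41],  # age 118 marks rows with unknown demographics
    "income": [72000.0, 48000.0, np.nan, 91000.0],
    "became_member_on": [20170512, 20180923, 20160101, 20190204],
})

# Drop the small fraction of rows with missing demographics
profile = profile.dropna(subset=["gender", "income"])

# Dummies for categorical columns
profile = pd.get_dummies(profile, columns=["gender"])

# Parse the membership date into a datetime column
profile["became_member_on"] = pd.to_datetime(
    profile["became_member_on"].astype(str), format="%Y%m%d"
)
```

The same dropna/get_dummies/to_datetime pattern applies to whichever columns in the real profile, portfolio, and transcript files need it.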

I am creating a new dataframe called `moneyspent` from the transcript and profile data, merged on customer id, to see how money spent varies by gender.
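A hedged sketch of that merge-and-aggregate step; the column names (`person`, `amount`, `id`) are assumptions standing in for the real transcript and profile fields:

```python
import pandas as pd

# Toy stand-ins for transcript.json and profile.json
transcript = pd.DataFrame({
    "person": ["a", "a", "b", "c"],   # assumed customer-id column
    "amount": [12.5, 3.0, 20.0, 7.5],  # assumed transaction amount
})
profile = pd.DataFrame({
    "id": ["a", "b", "c"],
    "gender": ["F", "M", "F"],
})

# Merge on customer id, then total the amount spent per gender
moneyspent = transcript.merge(profile, left_on="person", right_on="id")
spent_by_gender = moneyspent.groupby("gender")["amount"].sum()
```

`spent_by_gender` is then the series behind the gender-vs-amount plot described below.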

To understand this, I plot a graph of amount spent by gender.

From this graph it is clear that women spend more money in general.

Next, I plot a graph to see the distribution of each event (offer received, viewed, completed) over the offer types (BOGO, discount, informational).

A lot of people receive an offer, out of which many view it, and only a few actually make a transaction to redeem it. Since sending offers costs the company money, sending offers to the right set of people is important and can attract more people to make transactions.

Data cleaning alone is not enough; in a few cases new data needs to be derived from the existing data to get more insights.
There is a specific case in this dataset that makes this step necessary.

An offer can be completed and its benefit availed without the customer actually viewing the offer. These customers would have made a purchase anyway, so identifying them is necessary.

I will have a separate column to define which outcomes in the data can be classified as success cases.

**BOGO and Discount Offers**
In general, possible event paths for BOGO and discount offers are:

- **successful offer**: offer received → offer viewed → transaction(s) → offer completed
- **ineffective offer**: offer received → offer viewed
- **unviewed offer**: offer received
- **unviewed success**: offer received → transaction(s) → offer completed

A successful offer is the path in which an offer was completed after being viewed; that is the desirable outcome.

Both an ineffective offer and an unviewed offer reflect that the offer was not successful, i.e., it did not lead to transactions by the customer (or led to insufficient transactions). These can be treated as failures of the offer.

However, it is important to keep in mind that there can also be unviewed success cases, meaning that the customer did not view the offer but completed it anyway, i.e., the customer made transactions regardless of the offer.
It is therefore very important to separate unviewed success from successful offer, so that the right customers can be targeted.

Ideally, Starbucks might want to target the customer groups that exhibit event path 1, while not targeting those most likely to follow paths 2 and 3, because those are not going to make transactions, or path 4, because those customers make transactions anyway, so Starbucks would actually lose money by giving them discounts or BOGO offers.

So after this step, I will have a column called `success`, which has value 1 if the offer was successful and 0 otherwise.
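The labeling rule above can be sketched as follows; `viewed` and `completed` are assumed boolean flags per received offer, derived from the transcript events:

```python
import pandas as pd

# One row per received offer; 'viewed' and 'completed' are assumed flags
offers = pd.DataFrame({
    "viewed":    [True,  True,  False, False],
    "completed": [True,  False, False, True],
})

# success = 1 only for received -> viewed -> completed;
# an 'unviewed success' (completed but never viewed) is NOT counted as success
offers["success"] = (offers["viewed"] & offers["completed"]).astype(int)
```

Row by row this reproduces the four paths: successful offer, ineffective offer, unviewed offer, and unviewed success.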

Now I will merge all the data into one dataframe called `full_data` to feed to the machine learning classification model.

I am defining a function called `Classification_model` with a random forest model as the default classifier. The **metrics** used are accuracy_score and roc_auc_score.
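A minimal sketch of such a helper, assuming (not the author's exact code) that it fits a passed-in classifier, defaulting to a random forest, and reports the two metrics. The synthetic dataset stands in for `full_data`:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

def classification_model(X_train, X_test, y_train, y_test, clf=None):
    """Fit a classifier (random forest by default) and report the two metrics."""
    clf = clf or RandomForestClassifier(random_state=42)
    clf.fit(X_train, y_train)
    preds = clf.predict(X_test)
    scores = clf.predict_proba(X_test)[:, 1]  # class-1 probabilities for ROC AUC
    return accuracy_score(y_test, preds), roc_auc_score(y_test, scores)

# Synthetic stand-in for full_data
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
acc, auc = classification_model(*train_test_split(X, y, test_size=0.3, random_state=0))
```

Swapping `clf` for an `AdaBoostClassifier()` gives the comparison made below.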

For the random forest model, I get the following result:

```
              precision    recall  f1-score   support

     failure       0.70      0.70      0.70     10390
     success       0.67      0.67      0.67      9561

 avg / total       0.69      0.69      0.69     19951

Overall model accuracy: 0.686331512204902
Train ROC AUC score: 0.9863216113589416
Test ROC AUC score: 0.7465330713208808
```

And for the AdaBoost classifier, I get the following result:

```
              precision    recall  f1-score   support

     failure       0.68      0.68      0.68     10390
     success       0.66      0.66      0.66      9561

 avg / total       0.67      0.67      0.67     19951

Overall model accuracy: 0.6710941807428199
Train ROC AUC score: 0.7295966493333714
Test ROC AUC score: 0.7299133701950669
```

Of the two, I feel the AdaBoost classifier performs better without overfitting: its train and test ROC AUC scores are close, whereas the random forest's train score is far above its test score. So the AdaBoost classification model can be used to predict whether an offer will be successfully completed based on customer and offer characteristics.

I am performing GridSearchCV on AdaBoostClassifier on below parameters:

```
param_grid = {'n_estimators': [50, 100, 500],
              'learning_rate': [0.1, 0.7, 1],
              'algorithm': ['SAMME.R', 'SAMME']}
```

and the following best parameters were found:

```
{'algorithm': 'SAMME.R', 'learning_rate': 0.1, 'n_estimators': 500}
```

After applying these parameters, I got the following result:

```
              precision    recall  f1-score   support

     failure       0.68      0.68      0.68     10390
     success       0.65      0.66      0.65      9561

 avg / total       0.67      0.67      0.67     19951

Overall model accuracy: 0.6689890231066112
Train ROC AUC score: 0.7295248849870593
Test ROC AUC score: 0.7302833968483007
```

The optimal model found via GridSearchCV does not perform better than the initial model using mostly default settings, so I am using the AdaBoost classifier with default settings for the rest of the process.

The final model I consider has an accuracy of 67%, which is a decent number, although there is certainly room for improvement.

Based on the model, I plot the feature importances of the features used by the model.

The most relevant factors for offer success based on the model are:

- Membership duration
- Income
- Age
- Offer duration

To understand this, I am plotting a graph comparing the performance of BOGO and discount offers.

Looking at the graph, it can be said that the **discount offer is more successful**: not only is the absolute number of 'offer completed' events slightly higher than for the BOGO offer, its overall completed/received rate is also about 7% higher. The BOGO offer has a much greater chance of being viewed by customers, but the discount offer is better at turning 'offer received' into 'offer completed'.

We can also check for offers that followed the full successful path: offer received → offer viewed → transaction(s) → offer completed.

This also shows that the discount offer performs better than the BOGO offer.

Possible improvements:

- Explore better modeling techniques and algorithms to see whether model performance can be improved.
- Do not drop the observations with missing values, but use some kind of imputation strategy to see whether the model can be improved this way.

Then you should read this blog.

Once you have done enough modeling and crossed the beginner barrier, you will find yourself doing the same few steps over and over again in the same analysis. You need a tool to automate these repeating steps.

And guess what?

Python's scikit-learn has such a tool: Pipeline, which helps you clearly define and automate these workflows.

A pipeline allows a linear sequence of data transforms to be chained together.

Scikit-learn's Pipeline class is a useful tool for encapsulating multiple different transformers alongside an estimator into one object, so that you only have to call your important methods once (fit(), predict(), etc.).

For better understanding, let us consider a simple example of a machine learning workflow where we generate features from text data using a count vectorizer and tf-idf transformer, and then fit a random forest classifier.

But to understand how much Pipeline helps, we should first do the same process without it and then compare.

**Without pipeline**:

```
vect = CountVectorizer()
tfidf = TfidfTransformer()
clf = RandomForestClassifier()
# train classifier
X_train_vect = vect.fit_transform(X_train)
X_train_tfidf = tfidf.fit_transform(X_train_vect)
clf.fit(X_train_tfidf, y_train)
# predict on test data
X_test_vect = vect.transform(X_test)
X_test_tfidf = tfidf.transform(X_test_vect)
y_pred = clf.predict(X_test_tfidf)
```

What are CountVectorizer() and TfidfTransformer()?

What is RandomForestClassifier()?

What are the transformers and estimators we saw in the Pipeline definition?

**TRANSFORMER**: A transformer is a specific type of estimator that has a fit method to learn from training data, and then a transform method to apply a transformation model to new data. These transformations can include cleaning, reducing, expanding, or generating features.

In the example above, CountVectorizer and TfidfTransformer are transformers.

That's why we used `vect.fit_transform`.

**ESTIMATOR**: An estimator is any object that learns from data and extracts or filters useful features from raw data. Since estimators learn from data, each must have a fit method that takes a dataset.
In the example, RandomForestClassifier is an estimator and has a fit method.

**PREDICTOR**: A predictor is a specific type of estimator that has a predict method to predict on test data based on a supervised learning algorithm, and has a fit method to train the model on training data. The final estimator, RandomForestClassifier, in the example is a predictor.

Fortunately, we can automate all of this fitting, transforming, and predicting, by chaining these estimators together into a single estimator object. That single estimator would be scikit-learn's Pipeline.

To create this pipeline, we just need a list of (key, value) pairs, where the key is a string containing what you want to name the step, and the value is the estimator object.

**With pipeline**:

```
pipeline = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', RandomForestClassifier()),
])

# train classifier
pipeline.fit(X_train, y_train)

# evaluate all steps on test set
y_pred = pipeline.predict(X_test)
```

Now when we call fit() on the training data with the Pipeline, we get the same result as in the previous example without it. This makes the code shorter and simpler.

But to build a pipeline, each step has to be a transformer, except for the last step, which can be any estimator type. Since the final estimator of our pipeline is a classifier, the pipeline object can be used as a classifier, taking on the fit and predict methods of its last step. Alternatively, if the last estimator were a transformer, the pipeline would be a transformer.

Isn't this cool?

Pipeline makes our code Simple and convenient.

Chaining all of your steps into one estimator allows you to fit and predict on all steps of your sequence automatically with one call. Pipeline handles the smaller steps so we can focus on higher-level changes, which makes the workflow easier to understand.

Using Pipeline, all transformations for data preparation and feature extraction occur within each fold of the cross-validation process. This prevents common mistakes like letting your test data influence the training process (data leakage).
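A small sketch of this point: when the pipeline is passed to cross-validation, the scaler and feature selector below are refit inside each training fold, so the held-out fold never leaks into data preparation. (The particular steps here are illustrative, not from the text-processing example above.)

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Each CV training fold refits scaling and selection from scratch,
# so no statistics are computed from the held-out fold
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(k=10)),
    ("clf", LogisticRegression(max_iter=1000)),
])
scores = cross_val_score(pipe, X, y, cv=5)
```

Doing the scaling on all of X before splitting would leak test-fold statistics into training; the pipeline form avoids that by construction.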

So instead of repeating the same steps, let's use Pipeline.

For more information on Pipeline:

]]>If you are using Kaggle, you can find the data here.

This blog is based on my notebook on Kaggle. You can click here for the code.

GitHub repo of the project

In this analysis, I want to ask 4 questions and try to reach a solution with the data I have.

Data -> Question -> Solution.

Questions?

- Is formal education necessary to become a professional developer?
- As a software engineer, is it better to work in India or move to western countries?
- Which country has had the most developers over the last 4 years, and where does India stand in terms of total number of developers?
- Will you earn a higher salary if you contribute to open source?

We have 2 datasets for every year.

Consider for example year 2020,

- df_2020: contains the entire dataset
- df_2020_Schema: contains each column name from df_2020 and the survey question asked for that column

Let us look at what we understand from data for each question in the respective section.

The columns required for this analysis are present only in the 2020 dataset, so I am considering only the 2020 dataset for this question.

I plot a bar plot of the column 'NEWEdImpt', which records the importance of formal education, and check the count for each answer.
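A minimal sketch of that plot; the answer strings here are assumptions standing in for the survey's actual categories:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import pandas as pd

# Toy stand-in for the 2020 survey's NEWEdImpt column (categories assumed)
df_2020 = pd.DataFrame({"NEWEdImpt": [
    "Very important", "Somewhat important", "Not at all important",
    "Somewhat important", "Critically important",
]})

# Count each answer and draw the bar plot
counts = df_2020["NEWEdImpt"].value_counts()
ax = counts.plot(kind="bar")
ax.set_ylabel("Number of respondents")
```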

From the above plot it can be understood that:

- Almost 85% of the respondents who are professional developers feel that formal education is at least somewhat important, which is contrary to the popular notion that you don't need formal education to become a developer.
- However, almost 16% believe that it is not at all important or necessary.

To be fair, we will compare the importance of formal education with the salary respondents earn. And since we don't want everyone's opinion to skew the result, I am considering only professional developers.

To handle missing values, I am dropping all rows that don't have both NEWEdImpt and Salary.

Two reasons for dropping:

- I can afford to do this because plenty of data remains afterwards.
- Imputing salary values might distort the picture of opinions about formal education.
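That drop is one call in pandas; `ConvertedComp` is assumed here as the salary column name, so treat it as a placeholder:

```python
import numpy as np
import pandas as pd

# Toy stand-in; only rows with both NEWEdImpt and the salary column survive
df = pd.DataFrame({
    "NEWEdImpt": ["Very important", None, "Not at all important", "Somewhat important"],
    "ConvertedComp": [90000.0, 50000.0, np.nan, 65000.0],  # assumed salary column
})

# Drop any row missing either of the two columns of interest
df_clean = df.dropna(subset=["NEWEdImpt", "ConvertedComp"])
```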

- This means that professional developers with differing opinions about formal education earn pretty much the same.
- Professional developers who think formal education is not needed earn comparatively more than those who feel it is important, but they are only 16% of respondents.
- The remaining 85% of respondents who are professional developers feel that formal education is at least somewhat important and earn roughly the same.
- This doesn't mean people cannot become professional developers and earn a competitive salary without formal education; it is certainly possible, but only about 16% do.
- This plot was to see whether there is a large difference in salary between professional developers with and without formal education.

Since there is no large difference, going with the majority I conclude that to become a professional developer and earn a competitive salary, it is better to complete a formal education.

The columns required for this question are present only in the 2017 and 2019 datasets, so I am considering the 2017 and 2019 datasets here.

I am considering only full-time employed professional developers.

Here you can see that a programmer's salary in India is much lower than in the west, no matter how many years of coding experience you have.

Since CareerSatisfaction and JobSatisfaction are only present in 2017, I am basing this part of the analysis on the 2017 data alone.

After comparing salaries based on years of coding experience between India and the western world:

- I found that the salary of an Indian programmer is much lower than that of a western programmer, no matter how many years they have coded.
- Career and job satisfaction of western programmers are much higher than those of Indian programmers.

So if you want to earn a good salary with career and job satisfaction as you gain experience, it is better to move to western countries.

Can you guess which country had the most developers in the world from 2017 to 2020?

We write a function that returns a tuple with the details of the top 2 countries.

Then we plot the top 2 countries with the most developers.
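A minimal sketch of such a helper under assumed names (`Country` as the survey column, `top_two_countries` as a hypothetical function name):

```python
import pandas as pd

# Toy stand-in for one year's survey Country column
df_2020 = pd.DataFrame({"Country": [
    "United States", "India", "United States", "Germany", "India", "United States",
]})

def top_two_countries(df):
    """Return ((country, count), (country, count)) for the two most common countries."""
    counts = df["Country"].value_counts()  # sorted descending by count
    return tuple(counts.head(2).items())

top2 = top_two_countries(df_2020)
```

Applied to each year's dataframe, the resulting tuples feed the year-by-year bar plot.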

From the above plot it can be understood that:

- According to Stack Overflow, most of the developers on its platform are from the United States each year; the US contributes around 20 percent of developers in the world.
- India has the second most developers in the world according to Stack Overflow, contributing around 10 percent.

The columns required for this question are present only in the 2019 dataset, so I am considering the 2019 dataset here.

As we can see, the more you contribute to open source, the higher the total salary earned.

So it is a good idea to contribute to open source.

Seaborn is a library for making statistical graphics in Python. It builds on top of matplotlib and integrates closely with pandas data structures.

Seaborn helps you explore and understand your data. Its plotting functions operate on dataframes and arrays containing whole datasets and internally perform the necessary semantic mapping and statistical aggregation to produce informative plots. Its dataset-oriented, declarative API lets you focus on what the different elements of your plots mean, rather than on the details of how to draw them.

For more examples see this https://seaborn.pydata.org/examples/index.html

```
# import libraries
import pandas as pd # Import Pandas for data manipulation using dataframes
import numpy as np # Import Numpy for data statistical analysis
import matplotlib.pyplot as plt # Import matplotlib for data visualisation
import seaborn as sns # Statistical data visualization
```

To learn more about seaborn, let us learn with examples. Consider breast cancer dataset.

```
# Import cancer data from the sklearn library
from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()
```

```
cancer['feature_names']
```

```
array(['mean radius', 'mean texture', 'mean perimeter', 'mean area',
'mean smoothness', 'mean compactness', 'mean concavity',
'mean concave points', 'mean symmetry', 'mean fractal dimension',
'radius error', 'texture error', 'perimeter error', 'area error',
'smoothness error', 'compactness error', 'concavity error',
'concave points error', 'symmetry error',
'fractal dimension error', 'worst radius', 'worst texture',
'worst perimeter', 'worst area', 'worst smoothness',
'worst compactness', 'worst concavity', 'worst concave points',
'worst symmetry', 'worst fractal dimension'], dtype='<U23')
```

Create a DataFrame named df_cancer with the input/output data:

```
df_cancer = pd.DataFrame(np.c_[cancer['data'], cancer['target']], columns = np.append(cancer['feature_names'], ['target']))
```

```
# head of the dataframe
df_cancer.head()
```

| | mean radius | mean texture | mean perimeter | mean area | mean smoothness | mean compactness | mean concavity | mean concave points | mean symmetry | mean fractal dimension | radius error | texture error | perimeter error | area error | smoothness error | compactness error | concavity error | concave points error | symmetry error | fractal dimension error | worst radius | worst texture | worst perimeter | worst area | worst smoothness | worst compactness | worst concavity | worst concave points | worst symmetry | worst fractal dimension | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17.99 | 10.38 | 122.80 | 1001.0 | 0.11840 | 0.27760 | 0.3001 | 0.14710 | 0.2419 | 0.07871 | 1.0950 | 0.9053 | 8.589 | 153.40 | 0.006399 | 0.04904 | 0.05373 | 0.01587 | 0.03003 | 0.006193 | 25.38 | 17.33 | 184.60 | 2019.0 | 0.1622 | 0.6656 | 0.7119 | 0.2654 | 0.4601 | 0.11890 | 0.0 |
| 1 | 20.57 | 17.77 | 132.90 | 1326.0 | 0.08474 | 0.07864 | 0.0869 | 0.07017 | 0.1812 | 0.05667 | 0.5435 | 0.7339 | 3.398 | 74.08 | 0.005225 | 0.01308 | 0.01860 | 0.01340 | 0.01389 | 0.003532 | 24.99 | 23.41 | 158.80 | 1956.0 | 0.1238 | 0.1866 | 0.2416 | 0.1860 | 0.2750 | 0.08902 | 0.0 |
| 2 | 19.69 | 21.25 | 130.00 | 1203.0 | 0.10960 | 0.15990 | 0.1974 | 0.12790 | 0.2069 | 0.05999 | 0.7456 | 0.7869 | 4.585 | 94.03 | 0.006150 | 0.04006 | 0.03832 | 0.02058 | 0.02250 | 0.004571 | 23.57 | 25.53 | 152.50 | 1709.0 | 0.1444 | 0.4245 | 0.4504 | 0.2430 | 0.3613 | 0.08758 | 0.0 |
| 3 | 11.42 | 20.38 | 77.58 | 386.1 | 0.14250 | 0.28390 | 0.2414 | 0.10520 | 0.2597 | 0.09744 | 0.4956 | 1.1560 | 3.445 | 27.23 | 0.009110 | 0.07458 | 0.05661 | 0.01867 | 0.05963 | 0.009208 | 14.91 | 26.50 | 98.87 | 567.7 | 0.2098 | 0.8663 | 0.6869 | 0.2575 | 0.6638 | 0.17300 | 0.0 |
| 4 | 20.29 | 14.34 | 135.10 | 1297.0 | 0.10030 | 0.13280 | 0.1980 | 0.10430 | 0.1809 | 0.05883 | 0.7572 | 0.7813 | 5.438 | 94.44 | 0.011490 | 0.02461 | 0.05688 | 0.01885 | 0.01756 | 0.005115 | 22.54 | 16.67 | 152.20 | 1575.0 | 0.1374 | 0.2050 | 0.4000 | 0.1625 | 0.2364 | 0.07678 | 0.0 |

Scatter plots are extremely useful for analyzing the relationship between two quantitative variables in a data set. Often datasets contain multiple quantitative and categorical variables, and we may be interested in the relationship between two quantitative variables with respect to a third, categorical variable. Coloring the scatter plot by that categorical variable greatly enhances it.

The relationship between x and y can be shown for different subsets of the data using the hue parameter.

The default treatment of the hue, if present, depends on whether the variable is inferred to represent “numeric” or “categorical” data. In particular, numeric variables are represented with a sequential colormap by default, and the legend entries show regular “ticks” with values that may or may not exist in the data.

```
# Plot scatter plot between mean area and mean smoothness
# hue is target.
sns.scatterplot(x = 'mean area', y = 'mean smoothness', hue = 'target', data = df_cancer)
```

```
# Let's print out countplot to know how many samples belong to class #0 and #1
sns.countplot(df_cancer['target'], label = "Count")
```

To plot multiple pairwise bivariate distributions in a dataset, you can use the pairplot() function. This shows the relationships for each pairwise combination of variables in a DataFrame as a matrix of plots, with univariate plots on the diagonal.

Plot pairwise relationships in a dataset.

By default, this function will create a grid of Axes such that each numeric variable in data will by shared across the y-axes across a single row and the x-axes across a single column. The diagonal plots are treated differently: a univariate distribution plot is drawn to show the marginal distribution of the data in each column.

It is also possible to show a subset of variables or plot different variables on the rows and columns

```
# Plot the pairplot
sns.pairplot(df_cancer, hue = 'target', vars = ['mean radius', 'mean smoothness'] )
```

A heatmap is a graphical representation of data that uses colors to visualize the values of a matrix. Brighter (typically reddish) colors represent more common values or higher activity, and darker colors represent less common values or lower activity. A heatmap is also known as a shading matrix. Heatmaps in Seaborn are plotted using the seaborn.heatmap() function.

```
# Strong correlation between mean radius and mean perimeter, and between mean area and mean perimeter
plt.figure(figsize = (20, 10))
sns.heatmap(df_cancer.corr(), annot = True)
```

The distplot is used for a univariate set of observations and visualizes it through a histogram, i.e., a single observation column, hence we choose one particular column of the dataset.

```
# plot the distplot
# Displot combines matplotlib histogram function with kdeplot() (Kernel density estimate)
# KDE is used to plot the Probability Density of a continuous variable.
sns.distplot(df_cancer['mean radius'], bins = 25, color = 'blue')
```

```
# Plot two separate distplots for each target class #0 and target class #1
class_0_df = df_cancer[ df_cancer['target']==0 ]
class_1_df = df_cancer[ df_cancer['target']==1 ]
```

```
# Plot the distplot for both classes
plt.figure(figsize=(10, 7))
sns.distplot(class_0_df['mean radius'], bins = 25, color = 'blue')
sns.distplot(class_1_df['mean radius'], bins = 25, color = 'red')
plt.grid()
```

]]>“The grand essentials to happiness in this life are something to do, something to love, and something to hope for.” ~ Washington Burnap

“There’s no secret to it. The trick is just to live.”

"Have an important purpose in life. have an ikigai, but don’t take it too seriously. Be relaxed and enjoy all that you do"

These quotes may help you understand their mindset. "Live happily busy" is their mantra, and so they live long lives.

Curious to know what the world's longest-living people eat and drink?

Their diet barely includes meat. They eat less than ten grams of salt per day. They follow hara hachi bu: stop eating when the stomach is 80 percent full.

How do you face life's challenges without letting stress and worry age you?

One thing that everyone with a clearly defined ikigai has in common is that they pursue their passion no matter what.

They never give up, even when the cards seem stacked against them or they face one hurdle after another.

Fall seven times, rise eight

Resilience isn’t just the ability to persevere but it is an outlook we can cultivate to stay focused on the important things in life rather than what is most urgent, and to keep ourselves from being carried away by negative emotions.

We all have to face difficult moments, and the way we do so can make a huge difference to our quality of life.

Resilience is our ability to deal with setbacks. The more resilient we are, the easier it will be to pick ourselves up and get back to what gives meaning to our lives.

Resilient people know how to stay focused on their objectives, on what matters, without giving in to discouragement. Their flexibility is the source of their strength: They know how to adapt to change and to reversals of fortune. They concentrate on the things they can control and don’t worry about those they can’t.

Our pleasures and desires are not the problem. We can enjoy them as long as they don't take control of us.

What’s the worst thing that could happen?

The answer to this question may help us develop resilience.

The present is all that exists, and it is the only thing we can control. Instead of worrying about the past or the future, we should appreciate things just as they are in the moment, in the now. We should never forget that everything we have and all the people we love will disappear at some point. This is something we should keep in mind, but without giving in to pessimism. Being aware of the impermanence of things does not have to make us sad; it should help us love the present moment and those who surround us.

Antifragility is beyond resilience. Antifragile things gain from disorder.

We use the word fragile to describe people, things, and organizations that are weakened when harmed, and the words robust and resilient for things that are able to withstand harm without weakening.

Antifragility is beyond resilience or robustness. "The resilient resists shocks and stays the same; the antifragile gets better."

Get stronger when harmed.
Steps to build antifragility.

- Step 1: Create redundancies
- Step 2: Bet conservatively in certain areas and take many small risks in others
- Step 3: Get rid of the things that make you fragile

To build resilience into our lives, we shouldn’t fear adversity, because each setback is an opportunity for growth.

If we adopt an antifragile attitude, we’ll find a way to get stronger with every blow, refining our lifestyle and staying focused on our ikigai.

Taleb writes, "We need randomness, mess, adventures, uncertainty, self-discovery, near-traumatic episodes, all these things that make life worth living."

Life is pure imperfection, but if you have a clear sense of your ikigai, each moment will hold so many possibilities that it will seem almost like an eternity.

Once you discover your ikigai, pursuing it and nurturing it every day will bring meaning to your life.

The moment your life has this purpose, you will achieve a happy state of flow in all you do

Be led by your curiosity and intuition, and keep busy by doing things that fill you with meaning and happiness. It doesn’t need to be a big thing.

There is no perfect strategy to connecting with our ikigai. But what we learned from the Okinawans is that we should not worry too much about finding it.

Life is not a problem to be solved. Just remember to have something that keeps you busy doing what you love while being surrounded by the people who love you.

]]>In chapter 3, the book introduces us to logotherapy.

In logotherapy the patient sits up straight and has to listen to things that are, on occasion, hard to hear.

It helps you find reasons to live. Logotherapy pushes you to consciously discover life's purpose.
The quest to fulfill one's destiny then motivates one to break all the mental chains of the past and overcome whatever obstacles are encountered along the way.

One crazy question can change the way we look at life. Try to ask.

I know this is crazy question to ask but to find ikigai this is necessary.

List all the points in answer to this question; those points will become the driving force that allows us to achieve our goals.

This question helps us to search for Meaning of life.

Usually frustration arises when our life is without purpose. Viewed from a positive perspective, this frustration can be a catalyst for change. Logotherapy sees frustration as spiritual anguish: a natural and beneficial phenomenon that drives those who suffer from it to seek a cure, whether on their own or with the help of others, and in so doing to find greater satisfaction in life. It helps them change their own destiny.

We don't create the meaning of our life; we discover it. We have the capacity to do noble or terrible things. The side of the equation we end up on depends on our decisions, not on the conditions in which we find ourselves. If you have a goal to achieve, it will make you persevere towards it.

This meditation centers on three questions the individual must ask himself.

- What have I received from person X?
- What have I given to person X?
- What problems have I caused person X?

Through these reflections, we stop identifying others as the cause of our problems and deepen our own sense of responsibility. Accept that the world is imperfect, but that it is still full of opportunities for growth and achievement.

Next chapter deals with finding flow.

Imagine you are doing your favorite thing. Let's consider playing a shooting game. While playing, your entire focus is on the game: where the enemy will come from, where to shoot; your entire concentration is on that moment. *There is no future, no past. There is only the present.*
Your body and your consciousness are united as a single entity. You are completely immersed in the experience, not thinking about or distracted by anything else. Your ego dissolves, and you become part of what you are doing. Before we know it, several hours have passed.

This is the kind of experience Bruce Lee describes as “Be water, my friend.”

The opposite can also happen. When we have to complete a task we don’t want to do, every minute feels like a lifetime and we can’t stop looking at our watch.

That is the power of flow: the experience of being completely immersed in what we are doing, and the pleasure, delight, and creativity we find when we are completely immersed in life.

In order to achieve this optimal experience, we have to focus on increasing the time we spend on activities that bring us to this state of flow.

To achieve flow, three things are important.

- Knowing what to do
- Knowing how to do it
- Knowing how well you are doing

And to make work interesting, take on tasks that we have a chance of completing but that are slightly outside our comfort zone.

Activities that are too easy lead to apathy. If a task is too difficult, we won’t have the skills to complete it and will almost certainly give up.

So find the ideal middle path: something aligned with our abilities but with just a bit of a stretch, so we experience it as a challenge. Add a little something extra, something that takes you out of your comfort zone.

Having a clear objective is important in achieving flow, keep the objective in mind without obsessing over it.

Concentrate on a single task. Concentrating on one thing at a time may be the single most important factor in achieving flow.

Sophisticated simplicity: simplicity and attention to detail. It is not a lazy simplicity but a sophisticated one that searches out new frontiers.

When you get down to work, become one with the object you are creating.

**Microflow**

Enjoying daily boring tasks. If we get bored, we add a layer of complexity to amuse ourselves. Our ability to turn routine tasks into moments of microflow, into something we enjoy, is key to our being happy, since we all have to do such tasks.

The happiest people are not the ones who achieve the most. They are the ones who spend more time than others in a state of flow.

Using flow to find your ikigai.

- What do the activities that drive you to flow have in common?
- Why do those activities drive you to flow?

In the answers to these questions one might find the underlying ikigai that drives our life. If you don’t, then keep searching by going deeper into what you like by spending more of your time in the activities that make you flow. Also, try new things that are not on the list of what makes you flow but that are similar and that you are curious about.

In the next part we will look at resilience.

]]>The first chapter of IKIGAI - The Japanese Secret to a Long Happy Life tells the importance of the art of staying young while growing old. It tells us not to retire but to do what we love for as long as our health permits. The idea of retiring after a particular age should not stop you from doing what you love. And that comes when you realize your ikigai, which gives a sense of purpose to each and every day. A lot of importance is given to helping others and serving the community.

This book also gives key points for a long life.

The keys are diet, exercise, finding a purpose in life (an ikigai), and forming a strong social circle of friends and good family relations.

"Hara hachi bu" is a common Japanese saying which means something like fill your belly up to 80%. We should stop eating as we start feeling full. This helps to prevent long digestive processes and accelerate cell oxidation that helps to live a happier life for a longer period.

It also suggests serving meals on many small plates, which helps us eat less.

In Okinawa, moai are informal groups of people with similar interests who look out for one another. A moai can also help people find their purpose; through it, serving others in the community can become your ikigai.

Moai is an excellent practice for creating great team bonding and for the overall development of a team or community. With such community bonding, one never feels left out or worthless in life. I loved the fact that in Okinawa, they believe in growing as a community rather than as individuals. This is another secret to a happy life.

Members of a moai make a monthly contribution to the group, which is used for activities. If a member is in financial trouble, they can get an advance from the group's savings. This removes financial stress and leads to a happier life.

The next chapter talks about **ANTIAGING SECRETS**.

For antiaging you need not take huge steps or completely change your lifestyle; little things add up to a long and happy life.

The book explains aging's escape velocity with the example of a rabbit.

- Imagine a sign far off in the future with a number on it that represents the age of your death.
- Every year that you live, you advance closer to the sign. When you reach the sign, you die.
- Now imagine the rabbit holding the sign and walking into the future at a pace of half a year for every year you live. After a while, you will still reach the rabbit and die.
- But what if the rabbit could walk at a pace of one year for every year of your life? You would never be able to catch the rabbit, and therefore you would never die.
- The speed at which the rabbit walks to the future is technology.

The more we advance technology and knowledge of our bodies, the faster we can make the rabbit walk. Aging's escape velocity is the moment at which the rabbit walks at a pace of one year per year or faster, and we become immortal.

To advance technology we need an active mind and a youthful body. Having a youthful mind drives us toward a healthy lifestyle that slows the aging process. Daily physical exercise helps.

Stress is a major cause of aging: it promotes cellular aging by weakening cell structures.

But **how does Stress work**??

Stress is the immediate response to information that is potentially dangerous or problematic.

Stress has a degenerative effect over time. A sustained state of emergency affects the neurons associated with memory and inhibits the release of certain hormones, the absence of which can cause depression.

**How to reduce stress**??

Practicing mindfulness and noticing our own responses helps. To become mindful, practice meditation, yoga, or breathing exercises.

But surprisingly, a small dose of stress is a positive thing; it helps us face challenges and put our heart and soul into work in order to succeed.

Getting a good amount of sleep is also important. Sleep produces melatonin, an antioxidant that helps us live longer. A high degree of emotional awareness is important too.

In the next part of this series we will learn about logotherapy and developing flow in everything we do.

]]>The book is written by Hector Garcia and Francesc Miralles.

This blog is not just a review but also my learnings from the book. The **purpose** of writing this blog is to motivate people to read the book; if you are unable to read it, then at least read this blog, which summarizes my learnings from it.

During lockdown, after feeling bored for days, I got a suggestion from one of my friends to read this book, Ikigai - The Japanese Secret to a Long and Happy Life. And it is one of the best suggestions I have received.

This book is all about the purpose of one's life, and how it can lead to happiness. When the authors Hector Garcia and Francesc Miralles learned about ikigai, they decided to find its real meaning and how it works. They visited Okinawa, the island with the most centenarians in the world, who believe that their ikigai is the reason to jump out of bed each morning. Just like for the others, ikigai was a mystery for me too. Reading about the island, its community, and its centenarians left me puzzled and curious to understand more. Reading this book was not just a quarantine pastime; it was a mystery to be unfolded.

In simple terms, ikigai means the purpose of your life. Everybody has a purpose in their life. Without a purpose to fulfil, or a goal to chase, life would appear meaningless. Ikigai is a combination of what you love, what you are good at, what the world needs, and what you can be paid for: a compound of your passion, profession, mission, and vocation.

In short, if you want to know how to live long, read this book.

This book asks some important questions like

- What is the meaning of my life?
- Is the point just to live longer, or should I seek a higher purpose?
- Why do some people know what they want and have a passion for life, while others languish in confusion?

These questions help us understand the meaning of ikigai.

Once you understand the meaning of the word Ikigai, the book tries to explain

- The deep art of staying young while growing old.
- References to the five Blue Zones in the world, regions whose residents live longer than average, and the secrets of their long lives.
- How stress and a lot of sitting add to your age and reduce your life span.
- Interestingly, it also mentions very prominently that a little stress is good for you, since it keeps you going.
- A deep dive into discovering the meaning of your life.
- Finding flow in everything that you do.
- Some very interesting techniques to practice that can help you achieve flow.
- The importance of resilience.
- Finally, the ten rules of ikigai.

This book concludes by saying Life is pure imperfection, but if you have a clear sense of your ikigai, each moment will hold so many possibilities that it will seem almost like an eternity.

This is the first part of the series; I will write a few more parts summarizing my learnings from each chapter. You can find them in the series here.

Please do subscribe to the newsletter for updates.

]]>Click here for notebook.

Just change the input and check the output.

Learning by experiment and hands-on exercises is always better.

The purpose of this notebook is just to revise python basics.

Let's get started.

NumPy is a linear algebra library for working with multidimensional arrays.

NumPy brings the best of two worlds:

- C/Fortran computational efficiency,
- Python language easy syntax

```
import numpy as np
# Let's define a one-dimensional array
my_list = [10, 20, 30, 40, 50, 60, 70, 80]
my_list
```

```
[10, 20, 30, 40, 50, 60, 70, 80]
```

Let's create a numpy array from the list "my_list"

```
x = np.array(my_list)
x
```

```
array([10, 20, 30, 40, 50, 60, 70, 80])
```

Get shape

```
x.shape
```

```
(8,)
```

Let's create a Multi-dimensional numpy array from the list "my_list"

```
matrix = np.array([[5, 8], [9, 13]])
matrix
```

```
array([[ 5,  8],
       [ 9, 13]])
```

```
# "rand()" uniform distribution between 0 and 1
xy = np.random.rand(7)
xy
```

```
array([0.40408966, 0.12527144, 0.04465052, 0.39450693, 0.93339664,
       0.14009694, 0.94461679])
```

You can create a matrix of random numbers with random.rand.

```
xy = np.random.rand(2, 2)
xy
```

```
array([[0.86152202, 0.22526627],
       [0.41562272, 0.33467273]])
```

```
# "randn()" normal distribution between 0 and 1
xy = np.random.randn(7)
xy
```

```
array([-1.27678101,  1.20667812,  0.7945132 ,  0.62421099, -0.44447512,
       -0.57038096,  2.19949273])
```

"randint" is used to generate random integers between upper and lower bounds

```
xy = np.random.randint(1, 10)
xy
```

```
9
```

Create evenly spaced values with a step of 7.

```
xy = np.arange(1, 50, 7)
xy
```

```
array([ 1, 8, 15, 22, 29, 36, 43])
```

```
# Array of ones
xy = np.ones(7)
xy
```

```
array([1., 1., 1., 1., 1., 1., 1.])
```

```
# Matrices of ones
xy = np.ones((2, 2))
xy
```

```
array([[1., 1.],
       [1., 1.]])
```

```
# Array of zeros
xy = np.zeros(5)
xy
```

```
array([0., 0., 0., 0., 0.])
```

Reshape 1D array into a matrix

```
z = x.reshape(2,4)
print(x)
print(z)
```

```
[10 20 30 40 50 60 70 80]
[[10 20 30 40]
 [50 60 70 80]]
```

Obtain the maximum element (value)

```
x.max()
```

```
80
```

Obtain the minimum element (value)

```
x.min()
```

```
10
```

Obtain the location of the max element

```
x.argmax()
```

```
7
```

```
# Obtain the location of the min element
x.argmin()
```

```
0
```

```
# Access specific index from the numpy array
x[0]
```

```
10
```

```
# Slice from index 0 up to and NOT including index 3
x[0:3]
```

```
array([10, 20, 30])
```

```
# Broadcasting, altering several values in a numpy array at once
x[0:2] = 10
x
```

```
array([10, 10, 30, 40, 50, 60, 70, 80])
```

Pandas is a data manipulation and analysis tool that is built on NumPy.

Pandas uses a data structure known as DataFrame (think of it as Microsoft excel in Python).

DataFrames empower programmers to store and manipulate data in a tabular fashion (rows and columns).

Series vs. DataFrame? A Series is a single column of a DataFrame.

```
import pandas as pd
```

```
# Let's define two lists as shown below:
stock_list = ['Reliance', 'AMZN', 'facebook']
stock_list
```

```
['Reliance', 'AMZN', 'facebook']
```

```
label = ['stock#1', 'stock#2', 'stock#3']
label
```

```
['stock#1', 'stock#2', 'stock#3']
```

Let's create a one-dimensional Pandas Series.

Note that a Series is formed of data and associated labels.

```
x_series = pd.Series(data = stock_list, index = label)
```

```
# Let's view the series
x_series
```

```
stock#1 Reliance
stock#2 AMZN
stock#3 facebook
dtype: object
```

Let's obtain the datatype

```
type(x_series)
```

```
pandas.core.series.Series
```

Let's define a two-dimensional Pandas DataFrame

Note that you can create a pandas dataframe from a python dictionary

```
bank_client_df = pd.DataFrame({'Bank client ID': [1111, 2222, 3333, 4444],
                               'Bank Client Name': ['Kiran', 'Chaitanya', 'dheeraj', 'shreyas'],
                               'Net worth [$]': [3500, 29000, 10000, 2000],
                               'Years with bank': [3, 4, 9, 5]})
bank_client_df
```

| | Bank client ID | Bank Client Name | Net worth [$] | Years with bank |
|---|---|---|---|---|
| 0 | 1111 | Kiran | 3500 | 3 |
| 1 | 2222 | Chaitanya | 29000 | 4 |
| 2 | 3333 | dheeraj | 10000 | 9 |
| 3 | 4444 | shreyas | 2000 | 5 |

Let's obtain the data type

```
type(bank_client_df)
```

```
pandas.core.frame.DataFrame
```

You can view just the first few rows using .head().

```
bank_client_df.head(2)
```

| | Bank client ID | Bank Client Name | Net worth [$] | Years with bank |
|---|---|---|---|---|
| 0 | 1111 | Kiran | 3500 | 3 |
| 1 | 2222 | Chaitanya | 29000 | 4 |

You can view just the last few rows using .tail().

```
bank_client_df.tail(1)
```

| | Bank client ID | Bank Client Name | Net worth [$] | Years with bank |
|---|---|---|---|---|
| 3 | 4444 | shreyas | 2000 | 5 |

```
# Read a CSV file into a DataFrame, then write a DataFrame out to CSV
bank_df = pd.read_csv('sample.csv')
bank_df.to_csv('sample_output.csv', index = False)
```

```
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])
```

```
df1
```

| | A | B | C | D |
|---|---|---|---|---|
| 0 | A0 | B0 | C0 | D0 |
| 1 | A1 | B1 | C1 | D1 |
| 2 | A2 | B2 | C2 | D2 |
| 3 | A3 | B3 | C3 | D3 |

```
df2 = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],
                    'B': ['B4', 'B5', 'B6', 'B7'],
                    'C': ['C4', 'C5', 'C6', 'C7'],
                    'D': ['D4', 'D5', 'D6', 'D7']},
                   index=[4, 5, 6, 7])
```

```
df2
```

| | A | B | C | D |
|---|---|---|---|---|
| 4 | A4 | B4 | C4 | D4 |
| 5 | A5 | B5 | C5 | D5 |
| 6 | A6 | B6 | C6 | D6 |
| 7 | A7 | B7 | C7 | D7 |

```
df3 = pd.DataFrame({'A': ['A8', 'A9', 'A10', 'A11'],
                    'B': ['B8', 'B9', 'B10', 'B11'],
                    'C': ['C8', 'C9', 'C10', 'C11'],
                    'D': ['D8', 'D9', 'D10', 'D11']},
                   index=[8, 9, 10, 11])
```

```
df3
```

| | A | B | C | D |
|---|---|---|---|---|
| 8 | A8 | B8 | C8 | D8 |
| 9 | A9 | B9 | C9 | D9 |
| 10 | A10 | B10 | C10 | D10 |
| 11 | A11 | B11 | C11 | D11 |

```
pd.concat([df1, df2, df3])
```

| | A | B | C | D |
|---|---|---|---|---|
| 0 | A0 | B0 | C0 | D0 |
| 1 | A1 | B1 | C1 | D1 |
| 2 | A2 | B2 | C2 | D2 |
| 3 | A3 | B3 | C3 | D3 |
| 4 | A4 | B4 | C4 | D4 |
| 5 | A5 | B5 | C5 | D5 |
| 6 | A6 | B6 | C6 | D6 |
| 7 | A7 | B7 | C7 | D7 |
| 8 | A8 | B8 | C8 | D8 |
| 9 | A9 | B9 | C9 | D9 |
| 10 | A10 | B10 | C10 | D10 |
| 11 | A11 | B11 | C11 | D11 |

Click here for notebook.

Just change the input and check the output.

Learning by experiment and hands-on exercises is always better.

The purpose of this notebook is just to revise python basics.

Let's get started.

Get your name as an input and print it out on the screen

```
name = input("Welcome! Welcome! Welcome!, What's your name: ")
print('Hello', name)
```

```
Welcome! Welcome! Welcome!, What's your name: Kiran
Hello Kiran
```

```
# Booleans behave like integers 0 and 1.
True
```

```
True
```

```
urMoney = 100
friendsMoney = 200
print(urMoney == friendsMoney)
```

```
False
```

A list is a collection which is ordered and changeable.

List allows duplicate members.

```
my_list = ['Hello', 'Everyone', 'and', 'Welcome', 'to', 'Python', 'basics']
my_list
```

```
['Hello', 'Everyone', 'and', 'Welcome', 'to', 'Python', 'basics']
```

```
# Obtain the datatype
type(my_list)
```

```
list
```

```
# list with mixed datatypes
# (for example you can have strings and integers in one list)
# You can have a list inside another list (nested list)
my_list = ["GOOG", [3, 6, 7],"GOOG", [3, 6, 7],"GOOG", [3, 6, 7]]
my_list
```

```
['GOOG', [3, 6, 7], 'GOOG', [3, 6, 7], 'GOOG', [3, 6, 7]]
```

```
# Access specific elements in the list with Indexing
# Note that the first element in the list has an index of 0 (little confusing but you'll get used to it!)
my_list[1]
```

```
[3, 6, 7]
```

```
# List Slicing (getting more than one element from a list)
# obtain elements starting from index 3 up to and not including element with index 6
print(my_list[3:6])
```

```
[[3, 6, 7], 'GOOG', [3, 6, 7]]
```

```
# Obtain the length of the list (how many elements in the list)
len(my_list)
```

```
6
```

Dictionary consists of a collection of key-value pairs. Each key-value pair maps the key to its corresponding value.

Keys are unique within a dictionary while values may not be.

List elements are accessed by their position in the list, via indexing while Dictionary elements are accessed via keys

```
# Define a dictionary using key-value pairs
my_dict = {'key1':'value1',
'key2':'value2',
'key3':'value3'}
```

```
# Check the data type
type(my_dict)
```

```
dict
```

```
# Access specific element in the dictionary using a specific key (ex: Key2)
my_dict['key2']
```

```
'value2'
```

A string in Python is a sequence of characters

String can be enclosed by either double or single quotes

```
my_string = "Hello Everyone and Welcome to Python basics"
my_string
```

```
'Hello Everyone and Welcome to Python basics'
```

Split is used to divide up the string into words

The output from split is a list

```
x = my_string.split()
x
```

```
['Hello', 'Everyone', 'and', 'Welcome', 'to', 'Python', 'basics']
```

A tuple is a sequence of immutable Python objects.

Tuples are sequences, just like lists.

The differences between tuples and lists are, the tuples cannot be changed unlike lists and tuples use parentheses, whereas lists use square brackets.

```
tuple_1 = ('GOOGLE', 'APPLE', 10, 15, 1992)
tuple_2 = (450, 55, 977, 2100)
type(tuple_1)
```

```
tuple
```

```
# Accessing elements in a tuple
tuple_1[1]
```

```
'APPLE'
```

```
# Changing a value of a tuple does not work!
# 'tuple' object does not support item assignment
tuple_1[1] = 0
```

```
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-18-0d93d25c1b58> in <module>()
1 # Changing a value of a tuple does not work!
2 # 'tuple' object does not support item assignment
----> 3 tuple_1[1] = 0
TypeError: 'tuple' object does not support item assignment
```

A set is an unordered collection of items. Every element is unique (no duplicates).

A set is created by placing all the items (elements) inside curly braces {}, separated by commas, or by using the built-in function set().

```
my_set = {'GOOGLE', 'APPLE', 'Jio'}
print(my_set)
```

```
{'Jio', 'GOOGLE', 'APPLE'}
```

```
# sets do not have duplicates
my_set = {'GOOG', 'APPL', 'T','TSLA','T','AAPL'}
print(my_set)
```

```
{'GOOG', 'T', 'TSLA', 'AAPL', 'APPL'}
```

- A simple if-else statement is written in Python as follows:

```
if condition:
    statement #1
else:
    statement #2
```

- If the condition is true, execute the first indented statement
- if the condition is not true, then execute the else indented statements.
- Note that Python uses indentation (whitespace) to indicate code sections and scope.

```
if 10 > 9:
    print('If condition is True')
else:
    print('If condition is False')
```

```
If condition is True
```

For loops are used for iterating over a sequence (a list, a tuple, a dictionary, a set, or a string).

An action can be executed once for each item in the sequence the for loop iterates over.

```
for i in my_list:
    print(i)
```

```
GOOG
[3, 6, 7]
GOOG
[3, 6, 7]
GOOG
[3, 6, 7]
```

range() is 0-based, meaning counting starts at 0, not 1.

The integers generated by range() go up to, but do not include, the stop value.

Example: range(0, 10) generates integers from 0 up to, but not including, 10.

```
for i in range(6):
    print(i)
```

```
0
1
2
3
4
5
```

While loop can be used to execute a set of statements as long as a certain condition holds true.

```
i = 0
while i <= 7:
    print(i)
    i = i + 1
```

```
0
1
2
3
4
5
6
7
```

```
# Print elements until 'Python' is found; since 'Python' is not in my_list,
# the break never fires and every element is printed
for i in my_list:
    print(i)
    if i == 'Python':
        break
```

```
GOOG
[3, 6, 7]
GOOG
[3, 6, 7]
GOOG
[3, 6, 7]
```

```
# Print only odd elements and skip even numbers
No_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for i in No_list:
    if i % 2 == 0:
        continue
    print(i)
```

```
1
3
5
7
9
```

Instead of using loops and append(), a list comprehension iterates over a list, optionally conditions its elements, and includes the results in a new list.

```
input_list = [1, 2, 3, 4]
[ element ** 2 for element in input_list]
```

```
[1, 4, 9, 16]
```
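The example above transforms every element; a comprehension can also condition (filter) elements with an if clause:

```python
# Keep the squares of only the even elements
input_list = [1, 2, 3, 4]
even_squares = [element ** 2 for element in input_list if element % 2 == 0]
even_squares  # [4, 16]
```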

A lambda function is an anonymous function, a function without a name.

Lambda functions are mainly used with filter() and map().

A lambda function can receive any number of arguments but can only have one expression.

```
# The same kind of task written as a lambda expression
# Note that there is no function name
y = lambda x: x ** 2
```

map() applies a function to every element of a list and returns the results in a new list.

```
# Define two lists a and b
a = [1, 4, 5, 6, 9]
b = [1, 7, 9, 12, 7]
```

```
# Let's define a function that adds two elements together
def summation(a, b):
    return a + b
```

```
# You can now use map() to apply a function to the entire list and generate a new list
c = list(map(summation, a, b))
c
```

```
[2, 11, 14, 18, 16]
```
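The same summation can be written with map() and a lambda, skipping the named function entirely:

```python
# map() pulls one element from each list and feeds both to the lambda
a = [1, 4, 5, 6, 9]
b = [1, 7, 9, 12, 7]
c = list(map(lambda x, y: x + y, a, b))
c  # [2, 11, 14, 18, 16]
```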

```
prices = [105, 5055, 40, 356, 923, 1443, 222, 62]
```

```
# return only even numbers
out = list(filter( lambda x: (x % 2 == 0) , prices ))
out
```

```
[40, 356, 222, 62]
```

- open() is the key function for handling files.
- open() takes two parameters: (1) the filename and (2) the mode.

Modes for opening files:

- "r" - Read - Default value. Opens a file for reading, error if the file does not exist
- "a" - Append - Opens a file for appending, creates the file if it does not exist
- "w" - Write - Opens a file for writing, creates the file if it does not exist
- "x" - Create - Creates the specified file, returns an error if the file exists

Files can be opened in text or binary mode:

- "t" - Text - Default value. Text mode
- "b" - Binary - Binary mode (e.g. images)

```
# Open the file for reading
f = open('sample_file.txt', 'r')
print(f.readline())  # reads a single line
print(f.read())      # reads the rest of the file's content
f.close()

# Reopen in append mode before writing (mode 'r' does not allow writes)
f = open('sample_file.txt', 'a')
f.write('this is a new line that I added to the file!')
f.close()
```
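The open/close pattern above requires remembering f.close(). A common alternative is the with statement, which closes the file automatically even if an error occurs. A minimal sketch (with_demo.txt is a throwaway filename created here just for the demo):

```python
# Create, append to, and read back a demo file; "with" closes it for us
with open('with_demo.txt', 'w') as f:
    f.write('first line\n')

with open('with_demo.txt', 'a') as f:
    f.write('second line\n')

with open('with_demo.txt', 'r') as f:
    print(f.read())
```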

]]>What is the **Law of Large Numbers**?

Let us understand with an example.

Consider a city with two hospitals: one with about 15 births per day and one with about 45 births per day. About 50% of babies born are boys and 50% are girls, so the probability of a boy is 50%.

Which of these hospitals, do you think has more days when 60% or more of the babies born are boys -- the hospital with 15 births per day or the hospital with 45 births per day?

Most people think there would be no difference. That there would be the same number of days in a year at the 15 birth per day hospital and the 45 birth per day hospital, when 60% or more of the babies born are boys.

But that is **not true**.

To understand the difference, we need to know the law of large numbers. Consider the hospital with 15 births per day: nine boys and six girls makes 60% boys.

At the hospital with 45 births per day, 60% boys means 27 boys and 18 girls.

But isn't 60% boys at the 45-birth hospital an unusual pattern, given that we know there's a 50/50 ratio of boys and girls? Drawing from a 50/50 distribution, getting 27 boys and 18 girls will happen only about 3 times in 100.

It could happen, but it is pretty darned unlikely.

Now, what's going on here?

The principle we use to understand this is the **law of large numbers**, which says that sample values (for example, proportions) resemble population values as a function of sample size.

The larger the sample, the less likely it is that you will get a fluke, a very unrepresentative value. 60% is a very unrepresentative value when the true value is 50%, but with a small sample that kind of difference is common. If the sample gets large enough, it becomes virtually impossible.

So you're going to get 60% or more boys at a hospital with 15 births every few days. You'll get 60% or more boys at a hospital with 45 births, maybe nine or ten times a year.
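This can be checked with a quick simulation. The sketch below (the 365-day year and the fixed seed are arbitrary choices, not from the course) counts how many days in a simulated year each hospital sees 60% or more boys:

```python
import random

def days_with_60pct_boys(births_per_day, days=365, seed=0):
    # Count days in a simulated year where at least 60% of births are boys,
    # assuming each birth is independently a boy with probability 0.5
    rng = random.Random(seed)
    count = 0
    for _ in range(days):
        boys = sum(rng.random() < 0.5 for _ in range(births_per_day))
        if boys / births_per_day >= 0.6:
            count += 1
    return count

print(days_with_60pct_boys(15))  # the small hospital: many such days
print(days_with_60pct_boys(45))  # the large hospital: far fewer
```

The smaller sample produces the extreme proportion far more often, which is exactly what the law of large numbers predicts.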

Let us consider the example of hiring for a job. Suppose you interview a man with a great record and a terrific recommendation from his previous employers, but in the interview the fellow doesn't do great. We generally tend to reject that guy.

That sounds like the kind of thing that happens all the time, right? But is the judgement really a reasonable one?

I don't want to tell my opinion, instead let us see what law of large numbers or in general Statistics tells about this.

To help you think about that, suppose a soccer coach is looking for a striker, and he goes to a practice of a high school kid who has a great scoring record and terrific reviews from his coaches. But at this practice, the kid misses some easy shots and just doesn't seem in control of the ball. So the coach thinks the kid shouldn't be pursued.

Now, is the coach's judgment reasonable or not?

People who know sports are quite likely to say no that's really not so reasonable.

One practice, it's just not that much evidence, there's lots of variability. Kid could have an off day or any other reason.

With this scenario in mind, let's go back to the hiring example. Do you still think the decision is reasonable?

Well, a 30-minute unstructured interview is not that much evidence.
In fact, people have looked at how well interview ratings predict later performance, for example in college, and the correlation almost never exceeds 0.10, which is very, very small.

That's equivalent to increasing the likelihood of hiring the better of two candidates from 50/50, which is what you would get if you were going to flip a coin to make the decision, to a 53% chance.

If you have past performance and other judgments by other people, you can do quite well in predicting these same kinds of performance that I just mentioned. In fact, if you weight things properly that are in the folder, you can raise the chances of picking the right person to 65% or 75%. So this is the principle of law of large numbers.

Sample values for events having a **chance component** resemble population values for those events as a function of sample size. In fact, the law of large numbers only applies where there is some kind of chance component. There is no chance component to measuring the distance between two cities, so the law of large numbers is irrelevant there: one measurement will do. But sports and job performance are highly variable.

We know that performance can be much higher or much lower on any given occasion. But most people never observe all that many interviews. And you don't get to see how well the prediction from the interview corresponds to performance of the person on the job.

The truth is the employer's judgment is even worse than the coach's judgement because interviews are not a sample of job performance or school performance. They're a sample of interview performance.

Interviews and performance on the job require different skills.

So the law of large numbers applies to all kinds of events in everyday life.

We apply the law of large numbers when we can see the variability, the error, but not for events that are just as important where we can't. The principle here is: if your variable is human behavior, assume there is error variance, and adjust your judgment accordingly.

Credit: Thanks to Richard E. Nisbett for Mindware: Critical Thinking for the Information Age course on Coursera.

]]>A variable is something that varies, as opposed to a constant.

For example, the current temperature is a value of the variable temperature: 27, 30, 19, 29. As opposed to the freezing temperature of water, which is a constant; it's always the same.

Anything that varies about a thing, event or person can be a variable.
Variables are distributed in some way. One of the ways they're most frequently distributed is the normal distribution pattern.

In this kind of distribution, the mean is the most common value, and that's in the middle. As you get further and further away, from the mean, cases become rarer and rarer.

Consider the distribution of cooking ability. Think of somebody who's not such a good cook, maybe your friend. Then think of somebody who's a great cook, maybe your grandma. Then think of somebody at the mean, the average cook. There are lots of average cooks compared to good old grandma and your poor friend. Cases become fewer and fewer as you get further from the mean.

An important point about the normal distribution, is that it can be described in terms of standard deviations from the mean. That's almost like the average deviation but not quite.

The average American male is a little less than 5'10". The average deviation is a little less than 3", and so is the standard deviation.

- Standard deviation fact number one: 68% of all cases (e.g., male heights) fall between -1 and +1 standard deviation. So slightly more than two-thirds of all American males are between 5'7" and 6'1".
- Standard deviation fact number two: 84% of all cases fall between the bottom of the distribution and +1 standard deviation. So 84% of all American males are shorter than 6'1", and about 16% are taller than 6'1".
- Standard deviation fact number three: 96% of all cases lie between -2 and +2 standard deviations. So 96% of all American males are taller than 5'4" and shorter than 6'4".
- Standard deviation fact number four: you can convert standard deviations to percentiles. The mean is always at the 50th percentile. +1 standard deviation is always at the 84th percentile: 84% of cases are below it and 16% above it.
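These facts can be verified numerically. The sketch below computes the standard normal CDF with Python's math.erf; the exact figures are 68.3%, 84.1%, and 95.4% (the 96% quoted above is a rounded classroom figure):

```python
import math

def phi(z):
    # Standard normal cumulative distribution function, via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(phi(1) - phi(-1), 3))   # fact one: share within +/-1 SD (~0.683)
print(round(phi(1), 3))             # fact two: share below +1 SD (~0.841)
print(round(phi(2) - phi(-2), 3))   # fact three: share within +/-2 SD (~0.954)
```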

For example, imagine you've designed a new way of teaching algebra. Kids taught by the old method get 72 on the exam and kids taught by the new method get 78 on the exam. Is that a big deal or not?

It completely depends on the standard deviation. The mean is 72; if the standard deviation is 6, then 78 is one standard deviation above the mean. That's a big gain, because it takes the average kid from the 50th percentile to about the 84th percentile, which is no joke.

On the other hand, assume that the standard deviation is 18. If so, it's not such a big deal. Because the gain is only one-third of a standard deviation, which is the equivalent of going from the 50th percentile to just the 64th percentile, which is not such a big deal. And you might want to take into consideration whether there are added costs if that's all the gain you're getting.
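The same conversion from standard deviations to percentiles can be sketched in code, assuming normally distributed exam scores (the exact normal-CDF value for a one-third SD gain is closer to the 63rd percentile than the 64th quoted above):

```python
import math

def percentile(score, mean, sd):
    # Percentile of a score under a normal distribution with the given mean and SD
    z = (score - mean) / sd
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(percentile(78, 72, 6)))   # SD = 6: about the 84th percentile, a big gain
print(round(percentile(78, 72, 18)))  # SD = 18: only about the 63rd, a modest gain
```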

Correlation measures the association between variables.

To give some examples, let's take the example of cooking ability, which we saw earlier, and relate it to age. We already know two points here.

Your friend is young and not a very good cook; Grandma is old and a great cook. If you were to collect additional data, you would find a tendency for people who are not great cooks to be youngish, and for people who are better than average to be relatively older. That tendency gives us the correlation.

Correlations range between minus one and plus one.

A correlation of -1 means there is a perfect negative correlation: the higher you go on the x variable, the lower you go on the y variable.

At the other end, a correlation of +1.0 indicates that the higher you go on variable x, the higher you go on variable y.

A correlation of -1 is just as strong as a correlation of +1; the two simply run in opposite directions.

There are two basic ways of looking at correlations.

**Rank order correlation**.

A rank-order correlation is a correlation between two variables whose values are ranks.

When variables are measured at least on ordinal scales, units of observation (e.g., individuals, nations, organizations, values) can be ranked.

A ranking is an ordering of units of observations with respect to an attribute of interest.

For example, nations can be ranked with respect to their quality of life, their freedom, etc. A rank is the position of a unit of observation (e.g., nation) in the ranking. Units of observation with higher ranks show the attribute of interest to a higher degree.

If one is interested in the association between two rankings (e.g., quality of life and freedom of nations), rank-order correlations can be calculated.

Correlations are the way we assess **reliability of measures**.

There are two ways to define reliability: the degree to which a measure of a particular variable gives the same value across occasions, or, equivalently, the degree to which a measure correlates with itself.

As an example, you could compute the correlation between measures of height taken on different occasions, and you would expect that correlation to be high.

Correlations are also the way that we measure **validity of measures**.
Validity is the degree to which a variable measures what it is supposed to measure.
There are two very important points about the **relationship between validity and reliability**.

The first point is that there can be no validity if there is no reliability.

If your measure gives you a different score every time and they're more or less random, so that you get a high score one measurement and a low score on another. And your friend gets a high score on one and a low score on the other and there's no relationship at all, then you can't have any validity for that measure.

There has to be some stability, some degree of getting the same answer twice, before you can have any validity for that measure at all. A second point is that reliability implies very little about validity. If reliability is zero, there can't be any validity. But at the other extreme, reliability can be absolutely perfect and there may still be no validity.

Credit: Thanks to Richard E. Nisbett for Mindware: Critical Thinking for the Information Age course on Coursera.

]]>August 22, 2020

I had written this post long ago and wanted to publish it, but waited for the fastai2 and fastai course 2020 release so that my experience might help new people joining the course. The fastai 2020 course is now released. I welcome everyone who is starting the fastai course 2020 to the fastai family.

Thank you Jeremy Howard and Rachel Thomas for their great contribution to the democratization of Deep Learning.

I have completed fastai part 1 (2019) and part 2, and I am very much impressed by the course. It is one of the best courses available to learn deep learning. And it is free. What more does anyone need to get started with deep learning?

**This is just my experience and what I feel about the course. So don't get me wrong: I am not discouraging you from joining other courses on the market. My true intention in writing this blog is to tell everyone to join and learn from the fast.ai course and, whatever may come your way, just complete the course.**

My first **glimpse** of the fastai course came a year ago, when a friend suggested it. I went through a few lessons, and after lesson 3 everything went above my understanding, because I was used to the university bottom-up approach and tried to learn how every line of code works on the first attempt. Jeremy had warned us to follow his words carefully; I missed that. I want to summarise the mistakes I made when getting started so that, hopefully, you will avoid making the same ones when learning from the fast.ai course.

On my first attempt, I discontinued the fastai course, feeling its difficulty after lesson 3, and started to learn deep learning from Coursera and Udacity courses. I completed a few of them. Then, 7 months back, when I wanted to build a working model on my own, I could only manage a very basic one, far away from state-of-the-art results. At this point I remembered Jeremy's words, "build state-of-the-art models", and wanted to explore the top-down approach. Surprisingly, the top-down approach suits me better than bottom-up. This time I wanted to complete the fastai course and finish what I had left incomplete.

Since I now had the experience of going through boring, theoretical courses, fastai seemed more interesting, and Jeremy's humor was the best part of his courses. Even though I realized this was the best course on the internet and went through it, I was stuck again after lesson 4. But this time I did not want to leave the course. So I went through the forum and every blog on the internet to learn how people have completed the fastai course before. Then I finally realized that the way to complete and understand the fastai course is just **the way Jeremy tells you to do it**.

Haha, I know this seems confusing, but let me explain!!!

When Jeremy says "we will learn this in further lessons", he means just that: don't break your head on that concept, and don't be in the mindset of learning everything on day 1. Trust him, and you will understand it in future lessons. You surely won't understand everything on the first attempt (at least I didn't), but after watching it 4 to 5 times, maybe you will. We are all so enthusiastic that we usually want to learn every line of code (which is not bad, but it didn't work for me), and we just forget to listen to Jeremy's words. So the best way to complete the course is to surrender your mind to Jeremy, **listen to every word he says, and just follow it**.

So what comes next after you complete the course? Don't worry, there is another part. And this repeats every year: two parts of the course, with a 6-month gap in between.

6 months!!! Don't worry, you need that time to digest the content of the course, explore all the techniques on a new dataset, and do some fun projects.

- Surrender your mind to Jeremy, listen to every word he says, and just follow it.
- People will suggest, and you may also feel tempted, to try another beginner-friendly course. Don't quit the fastai course; be persistent and finish what you started.
- People may tell you that you need to be able to read a lot of papers. That's correct, but the approach matters. At first, read what you can in the paper and convert the math to code. As you go, you will get better.
- Watching the lectures DOES NOT EQUATE TO DOING deep learning. So write all the code in the lesson and try to apply it to a different kind of dataset.
- When you start, watch each lecture once. It's okay that you won't understand everything; you don't have to. The important thing is to complete it.
- Code as much as you can.

Recently, while watching part 2 of the course for the 3rd time, I gained a new perspective on the fastai course.

I have gone through part 1 of the course nearly 5 times and part 2 three times. All the terminology, like epochs, learning rate, momentum, and fit-one-cycle, applies not just to datasets but also to the way of learning the fastai course itself.

I just wanted to connect technical concepts to way we learn. So this is just for fun:)

Consider **epochs** as a number of times you need to watch and learn from fastai course.

**Learning rate** is your pace of learning the course. A **discriminative learning rate** might be: when you start the course you take small steps, as you feel comfortable you take larger steps, and as the concepts become difficult you once again take small steps.

**Momentum**: after you complete the course once, you cover easy concepts of the course faster.

**Fit one cycle**: complete the course fully once, and repeat for the required number of epochs.

**Weights** may be your understanding of deep learning concepts before you start this course; they may be zero, or even pre-trained if you have done other courses.

As you learn the course, your weights get changed by **backpropagation**. As we fully understand the course, we get better accuracy.

Augmentation and dropout are also required to increase your understanding, i.e. accuracy.

**Dropout** can be treated as the highly tough concepts, or the things you don't want to get into at the beginning. These concepts get dropped out in the first iteration (even though that's not exactly dropout).

**Augmentation** can be treated as learning the same concepts in lessons on different datasets.

**I did not know how to connect the concepts of the loss function, the optimizer, and, most importantly, the help from the forums into this crazy perspective. If you can help me connect the dots, I would be grateful.**

I would love to hear feedback, your perspective. You can tweet @UKamath7 or comment on kaggle, reddit or medium.

*Originally published at https://kirankamath.netlify.app.*

Let us see how the definitions differentiate between the two.

“**Invention**” can be defined as the creation of a product or introduction of a process for the first time.

“**Innovation**” occurs if someone improves on or makes a significant contribution to an existing product, process or service.

Innovations are usually based on some invention, which means to say Innovation flows from the invention.

According to Horace Dediu: "Novelty: something new. Creation: something new and valuable. Invention: something new, having potential value through utility. Innovation: something new and uniquely useful."

Consider the Arduino. Someone invented the Arduino, but by itself the Arduino was nothing more than another piece of tech. It's what was done with that piece, the hundreds of products, processes and services that evolved from the invention of the Arduino, that required innovation. Arduino boards can read inputs (light on a sensor, a finger on a button, a Twitter message) and turn them into outputs (activating a motor, turning on an LED, publishing something online). All of these became possible by innovating products with the Arduino. Innovation is therefore the more demanding work.

Invention is said to be the creation of new things. The smartphone, the car, the desktop computer are inventions. Innovation is the continual upgrade of inventions. If you consider sending messages through phones an invention, then **WhatsApp** is an innovation. All the features you get with each upgrade are innovation. Innovating isn't the same as inventing: inventing is creating something that hasn't existed before, while innovating is changing or combining things, usually for commercial benefit.

Consider the example of **Apple**'s iPhone. We can treat the iPhone as, obviously, at first an invention, and the subsequent upgrades to it as innovation. But according to Tim Worstall, the iPhone itself was both invention and innovation, because it is in part derivative of other, earlier technologies. Apple didn't invent GPS, for example, but incorporating it into the phone is innovation, not invention.

Was the iPhone a great invention? **NO**.

We can dissect the iPhone into individual inventions. Like camera, GPS, calling service.

Was the iPhone a great innovation? **Absolutely YESS**.

The iPhone created an ecosystem of media content, telecommunications, licensing, application development, and unified them all under one roof. The iPad grew on that success and created a new “screen” to expand the mobile and personal experience.

Apple has made products that have owned and even defined their categories. But Apple invented less and combined inventions to innovate new products. This is one of the reasons technology grows at an exponential rate: it builds on previous technology rather than re-inventing at every step. Obviously the inventors need to be properly recognized and rewarded, but the real **game-changers** are the innovators.
Invention without innovation can just create toys. You may fulfill a personal desire to build, but you may not deliver a return on the investment of time and resources.

**Edison didn't invent the lightbulb**

Though he is often (falsely) credited with inventing the lightbulb (even I only read about this today), Thomas Edison did not create it. What he did was improve it **significantly** and make it commercially viable. His lightbulb **was an innovation**.

There were up to 20 inventors who created an incandescent lamp before Edison, but his was much, much better. Edison's lamp was widely adopted and became *the* lightbulb, also making it too hard for the others to compete.
Game-changing products are often innovations, not inventions (there are always exceptions, but generally speaking).
Today it's **easier to innovate than ever before**, because of the internet and the platforms available.

Given the choice to invent or innovate, most entrepreneurs would take the latter. Let's face it, innovation is just sexier. Perhaps a few people know who invented the lens, GPS, the LED, or the camera. But virtually everyone on the planet knows who Steve Jobs is.

While we use these words interchangeably, "invention" and "innovation" are actually not the same thing. There are distinctions between them, and those distinctions are important; I have tried to point out the differences. So how do you know whether something is invention or innovation? Consider this analogy:

If the invention is lighting a lamp, the innovation is the brightness that lets us see the room. If the invention is the bulb, the innovation is a product that makes the bulb give more light or run more efficiently, or that uses the same bulb technology to build different designs for different purposes. Someone has to light the lamp: that's the inventor. Someone has to recognize that the lamp will eventually light the room, and find ways to make it better: that's the entrepreneur. I have not invented this blog; I read a dictionary and many articles to learn the difference and put it all together in my own style, which means I have innovated this blog. Though I created it, it is innovation.

If you have a different view I'd like to hear your comments. You can either tweet or comment on Kaggle or Reddit.

]]>July 25, 2020

I got invited to the Google Foobar challenge on 25th March 2020, worked on it, and completed up to Level 4 by 2nd May 2020. I wrote this blog (experience) a month back but got busy with other work and couldn't publish, so I am publishing it today.

**#Blog20**

Photo by Christian Wiediger on Unsplash

Google has a secret way of hiring, Yess you read it right!!!

The Google Foobar challenge is a secret process for hiring developers and programmers all over the world. Google uses it to hire some of the best developers around the globe, ones they think can be a good match for their organization. It is a secret process, and the challenge consists of coding problems of increasing difficulty as you go along. The Google Foobar page is not accessible to everyone: Google sends an invitation, and only those programmers get an opportunity to participate in the Foobar challenge.

Google Foobar is an invitation-based hiring challenge, meaning you can only take the challenge if you have received an invitation from Google. The unique thing about the Foobar challenge is that **it** finds **you**, not the usual other way around.

So let me take you through my experience in the Google foobar challenge.

I was searching for Python `assert` and the page loaded…

Suddenly the browser window split open and I saw this:

“You’re speaking our language. Up for a challenge?”

And there was a link and I opened it and it redirected me to a new page ‘Foo.bar’

It's kind of like a Linux console (a UNIX command-line-like interface) where you solve the problems one by one, opening a new problem directory each time.

(I took this photo from my phone)

There is some text at the top of the screen:

“Google has a code challenge ready for you”

Just below, there is a paragraph of blueish text that sets the stage for a sci-fi adventure:

“Success! You’ve managed to infiltrate Commander Lambda’s evil organization, and finally earned yourself an entry-level position as a Minion on her space station. From here, you just might be able to subvert her plans to use the LAMBCHOP doomsday device to destroy Bunny Planet. Problem is, Minions are the lowest of the low in the Lambda hierarchy. Better buck up and get working, or you’ll never make it to the top…”

There was also a note that if you left the page it would expire, so I signed in.

There was a readme file, so I opened it, and that is how the challenge began.

The challenge consists of 5 levels of algorithm problems. I won't share the problems or the solutions, as that would be unfair.

But it was a really great experience.

The first few levels were relatively easy, but as the levels went up, the difficulty gained heights. I completed Level 4 but could not complete Level 5. The fourth level was quite intense: I was given weeks for each problem, and the problems needed multiple concepts ranging from number theory to graphs. I somehow completed it.

Upon completing level 3, I had to submit my personal details in the console.

Completing Level 4 took me roughly one and a half months, but the Level 5 problem was difficult and I could not crack the logic.

In the Foobar challenge, time is not a constraint until Level 3, but if you are not working on the challenge every day, it can become one after that. The same happened to me. The challenges in Levels 1, 2 and 3 were easy enough that I completed each within a day, but I did not request a new problem (to go to the next level) until I was free, which meant I did not take it seriously. Level 4, though, demanded working on each problem for a few hours every day, and I took weeks to complete each one.

Level 5 was even more difficult than Level 4. I was given weeks for that problem too, but I forgot about the challenge; when I opened Foobar again I had only 2 days left, and I just lost at Level 5.

So if you get an opportunity to participate in the Google Foobar challenge, don't look at the time given to you to solve the problem; just try to complete it and submit.

If you submit early, just enjoy the remaining time to learn and prepare for the next level.

For feedback, you can tweet or comment on Kaggle or Reddit.

*Originally published at https://kirankamath.netlify.app.*

July 18, 2020

**#Blog18**

I have written this as Kaggle Public Notebook if you have any feedback comment there.

The problem I have considered is multi-label classification. In addition to each image having multiple labels, the other challenge in this problem is the existence of rare classes and combinations of different classes. In this situation a normal random split doesn't work, because you can end up putting the rare cases in the validation set and your model will never learn about them. The stratification present in scikit-learn is also not equipped to deal with multi-label targets.

I have specifically chosen this problem because we may learn some techniques on the way, which we otherwise would not have thought of.

**There may be a better or easier way of doing k-fold cross-validation, but I have done it keeping in mind how to implement it using fastai.** If you know a better way, please mail or tweet the idea; I will try to implement it and give you credit.

I am using fastai2, so install it first.

```
! pip install -q fastai2
```

Cross-validation, as I see it, is the idea of minimizing the randomness of a single split by making n folds, each fold containing a train and a validation split. You train the model on each fold, so you have n models, and then you average the predictions from all the models, which supposedly gives us more confidence in the results. We will see this in the following code. I found the iterative-stratification package, which provides scikit-learn-compatible cross-validators with stratification for multi-label data.

**My opinion**:

In my opinion it's more important to make one right split, especially because CV takes n times longer to train. Then why did I do it?

I wanted to explore classification using cross-validation with fastai, for which I didn't find many resources. So writing this blog may help people.

fastai has no cross-validation split (as far as I know) that works like the other splitters in the library. It may be because cross-validation takes time, so maybe it's not considered that useful.

But still, in this situation I feel it's worth exploring with fastai.

So what is **stratification**?

The splitting of data into folds may be governed by criteria such as ensuring that each fold has the same proportion of observations with a given categorical value, such as the class outcome value. This is called stratified cross-validation.
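To see the idea, here is a toy single-label sketch (not the iterative-stratification algorithm itself, and the labels are made up): dealing each class's items round-robin across the folds keeps the class proportions the same in every fold.

```python
from collections import defaultdict

def stratified_folds(labels, n_splits):
    # toy single-label stratification: deal each class's indices
    # round-robin across folds so class proportions stay equal
    folds = [[] for _ in range(n_splits)]
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for items in by_class.values():
        for j, idx in enumerate(items):
            folds[j % n_splits].append(idx)
    return folds

labels = ['a'] * 6 + ['b'] * 3  # 2:1 class ratio overall
for fold in stratified_folds(labels, 3):
    print(sorted(labels[i] for i in fold))  # each fold: ['a', 'a', 'b']
```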

```
from fastai2.vision.all import *
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold
```

The dataset is from the Zero to GANs - Human Protein Classification in-class competition hosted on jovian.ml.

```
path = Path('../input/jovian-pytorch-z2g/Human protein atlas')
train_df = pd.read_csv(path/'train.csv')
train_df['Image'] = train_df['Image'].apply(str) + ".png"
train_df['Image'] = "../input/jovian-pytorch-z2g/Human protein atlas/train/" + train_df['Image']
train_df.head()
```

The method I use here: if we have a column called fold holding the fold number, it is easy to split the data using it.

fastai has IndexSplitter in the DataBlock API, so that will be helpful.

```
strat_kfold = MultilabelStratifiedKFold(n_splits=3, random_state=42, shuffle=True)
train_df['fold'] = -1
for i, (_, test_index) in enumerate(strat_kfold.split(train_df.Image.values, train_df.iloc[:, 1:].values)):
    train_df.iloc[test_index, -1] = i

train_df.head()
train_df.fold.value_counts().plot.bar();
```

Now that the data is in a dataframe and the folds are defined for cross-validation, we will build the dataloaders, for which we use a DataBlock.

If you want to learn how the fastai DataBlock works, see my blog series Make Code Simple with DataBlock API.

We will create a function get_data to create the dataloaders.

get_data uses the fold column to split the data for cross-validation using IndexSplitter. Compared to a single-label problem, the only extra thing to be done for multi-label is to add MultiCategoryBlock to blocks; this is how easy fastai makes it.

```
def get_data(fold=0, size=224, bs=32):
    return DataBlock(
        blocks=(ImageBlock, MultiCategoryBlock),
        get_x=ColReader(0),
        get_y=ColReader(1, label_delim=' '),
        splitter=IndexSplitter(train_df[train_df.fold == fold].index),
        item_tfms=[FlipItem(p=0.5), Resize(512, method='pad')],
        batch_tfms=[*aug_transforms(size=size, do_flip=True, flip_vert=True,
                                    max_rotate=180.0, max_lighting=0.6, max_warp=0.1,
                                    p_affine=0.75, p_lighting=0.75,
                                    xtra_tfms=[RandomErasing(p=0.5, sh=0.1, min_aspect=0.2, max_count=2)]),
                    Normalize],
    ).dataloaders(train_df, bs=bs)
```

Since this is a multi-label problem, the normal accuracy function won't work, so we use accuracy_multi. fastai provides this, and we could use it directly in metrics, but I wanted to know how it works, so here is its code.

```
def accuracy_multi(inp, targ, thresh=0.5, sigmoid=True):
    "Compute accuracy when `inp` and `targ` are the same size."
    if sigmoid: inp = inp.sigmoid()
    return ((inp > thresh) == targ.bool()).float().mean()
```

F-score is the evaluation metric for this competition, so I used this.

```
def F_score(output, label, threshold=0.2, beta=1):
    prob = output > threshold
    label = label > threshold
    TP = (prob & label).sum(1).float()
    TN = ((~prob) & (~label)).sum(1).float()
    FP = (prob & (~label)).sum(1).float()
    FN = ((~prob) & label).sum(1).float()
    precision = torch.mean(TP / (TP + FP + 1e-12))
    recall = torch.mean(TP / (TP + FN + 1e-12))
    F2 = (1 + beta**2) * precision * recall / (beta**2 * precision + recall + 1e-12)
    return F2.mean(0)
```

```
test_df = pd.read_csv('../input/jovian-pytorch-z2g/submission.csv')
tstpng = test_df.copy()
tstpng['Image'] = tstpng['Image'].apply(str) + ".png"
tstpng['Image'] = "../input/jovian-pytorch-z2g/Human protein atlas/test/" + tstpng['Image']
tstpng.head()
```

I have used a technique called mixup; it's a data augmentation technique.

In fastai, MixUp is a callback, and this callback is used to apply mixup data augmentation to your training.

I tried this for the first time, but the technique did not improve my result on this problem. It usually improves accuracy after around 80 epochs, but I trained for only 20 epochs, so there was no difference in accuracy with or without it. You can ignore it.

But it is good to know how mixup works; I will write a separate blog on it, so follow my Twitter for updates.

```
mixup = MixUp(0.3)
```

gc is for garbage collection

```
import gc
```

I have created 3 folds. For each fold I simply get the data, create a model (I have used resnet34), and add the metrics. That's the whole training process: I trained a model on each fold and saved its predictions for the test set.

I have also used a technique called progressive resizing.

It is very simple: start training using small images, and end training using large images. Spending most of the epochs training with small images helps training complete much faster; completing training using large images makes the final accuracy much higher. This approach is called progressive resizing.

We should use the `fine_tune` method after we resize our images, to get our model to learn to do something a little bit different from what it has learned before.

I have used `cbs=EarlyStoppingCallback(monitor='valid_loss')` so that the model does not overfit.

We append all predictions to a list so that we can use them later.

I ran the model for only a few epochs to check that the code works and to show a result, or stopped the model in between (it took so much time).

This method gave me an F-score of `.77` and an accuracy of `>91%`, so you can try it.

My purpose here is to write a blog and explain how to approach the problem and how the code works.

If the GPU runs out of memory, delete the learner and empty the CUDA cache, as done in the last lines of the code.

```
all_preds = []
for i in range(3):
    dls = get_data(i, 256, 64)
    learn = cnn_learner(dls, resnet34,
                        metrics=[partial(accuracy_multi, thresh=0.2), partial(F_score, threshold=0.2)],
                        cbs=mixup).to_fp16()
    learn.fit_one_cycle(10, cbs=EarlyStoppingCallback(monitor='valid_loss'))
    learn.dls = get_data(i, 512, 32)
    learn.fine_tune(10, cbs=EarlyStoppingCallback(monitor='valid_loss'))
    tst_dl = learn.dls.test_dl(tstpng)
    preds, _ = learn.get_preds(dl=tst_dl)
    all_preds.append(preds)
    del learn
    torch.cuda.empty_cache()
    gc.collect()
```

Stack all the predictions stored in the list and average the values.

```
subm = pd.read_csv("../input/jovian-pytorch-z2g/submission.csv")
preds = np.mean(np.stack(all_preds), axis=0)
```

You should have the list of label names, which we get using the vocab.

```
preds[0]
```

I found that a threshold of 0.2 works well for my code.

```
thresh = 0.2
k = dls.vocab  # k holds the label names (the dataloaders' vocab)
labelled_preds = [' '.join([k[i] for i, p in enumerate(pred) if p > thresh]) for pred in preds]
```

Then all the labels predicted above 0.2, looked up in the vocab, become the labels of that image.

Put them in the Label column.

```
test_df['Label'] = labelled_preds
```

This step writes the file to submit the result to Kaggle.

```
test_df.to_csv('submission.csv', index=False)
```

I have written this as a Kaggle public notebook; if you like it, please upvote.

Thank you for reading:)

*Originally published at https://kirankamath.netlify.app.*

I wrote this blog a month ago while learning from the jovian.ml PyTorch course. We had to write and submit a blog about 5 PyTorch functions we like as the first assignment. It was fun.

So let's get started.

These are the 5 functions I found interesting to write.

- torch.clamp()
- torch.argmax()
- torch.where()
- torch.from_numpy
- torch.matmul()

```
# Import torch and other required modules
import torch
```

torch.clamp() clamps all elements of the input into the range [min, max] and returns the resulting tensor.

```
# Example 1 - working
i = torch.randn(7)
torch.clamp(i, min=-0.5, max=0.5)
```

```
tensor([ 0.3991, -0.5000, 0.4462, -0.4993, 0.5000, 0.5000, 0.2525])
```

This function restricts the minimum and maximum values of the elements in a tensor. In the above example we set the minimum to -0.5 and the maximum to 0.5.

```
# Example 2 - working
torch.clamp(i, min=0.0)
```

```
tensor([0.3991, 0.0000, 0.4462, 0.0000, 1.6489, 0.9907, 0.2525])
```

The above is nothing but the ReLU function: ReLU replaces every value below zero with zero, just as in this example.

```
# Example 3 - breaking (to illustrate when it breaks)
torch.clamp(i)
```

```
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-16-bf99f0e86d17> in <module>
1 # Example 3 - breaking (to illustrate when it breaks)
----> 2 torch.clamp(i)
RuntimeError: At least one of 'min' or 'max' must not be None
```

clamp needs at least one bound; without min or max, the function breaks.

ReLU can be implemented using the clamp function.
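As a quick sketch of that point (assuming PyTorch is installed), clamping with only a lower bound of zero reproduces ReLU:

```python
import torch

def relu(x):
    # ReLU is just clamp with min=0 and no upper bound
    return torch.clamp(x, min=0.0)

t = torch.tensor([-1.0, 0.5, -2.0, 3.0])
print(relu(t))                              # tensor([0.0000, 0.5000, 0.0000, 3.0000])
print(torch.equal(relu(t), torch.relu(t)))  # True: matches the built-in ReLU
```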

torch.argmax() returns the index of the maximum value of all the elements in the input tensor. This function is simple but can be used to compute accuracy in single-label classification by passing it the predictions. Before using it, manual_seed is called to set the seed and get the same output every time, which is useful for demonstration.

```
# Example 1 - working (change this)
torch.manual_seed(49)
k=torch.randn(7,7)
print(k)
torch.argmax(k)
```

```
tensor([[ 0.2705, -0.3641, 0.5421, 0.1219, 0.5471, -1.1156, 0.5146],
[ 0.5792, -0.1513, -0.7178, 0.5251, 2.2830, 0.0806, 1.1384],
[-0.5584, -0.4422, 0.0927, 0.1392, -0.9433, 0.6335, -0.2762],
[-0.7085, -0.8226, -0.2340, 0.3303, 1.0855, 0.5016, -0.8041],
[ 1.6240, 1.5190, -1.2851, -2.4165, -0.3303, 0.6343, -1.5740],
[-0.7344, -0.2683, -0.3083, 0.8369, 0.6258, 1.2411, -1.2252],
[ 0.3188, 0.6634, 0.2450, 0.1627, 0.8132, 0.2792, -0.2150]])
tensor(11)
```

It returns the flattened index of the maximum value in tensor k. Here the maximum value, 2.2830, is at index 11.

```
# Example 2 - working
print(torch.argmax(k,dim=1))
print(torch.argmax(k,dim=1,keepdim=True))
torch.argmax(k,dim=0,keepdim=True)
```

```
tensor([4, 4, 5, 4, 0, 5, 4])
tensor([[4],
[4],
[5],
[4],
[0],
[5],
[4]])
tensor([[4, 4, 0, 5, 1, 5, 1]])
```

Here dim=1 means the max is taken along dimension 1, i.e. across each row; with dim=0 it is taken along each column. keepdim controls whether the output tensor retains the reduced dimension.

```
# Example 3 - breaking (to illustrate when it breaks)
torch.argmax(torch.tensor([[4.,6.],[8.,10.,12.],[14.,16.]]))
```

```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-7-e55fa7412cd2> in <module>
1 # Example 3 - breaking (to illustrate when it breaks)
----> 2 torch.argmax(torch.tensor([[4.,6.],[8.,10.,12.],[14.,16.]]))
ValueError: expected sequence of length 2 at dim 1 (got 3)
```

The error here actually comes from torch.tensor, not argmax: the nested lists have different lengths, so a valid rectangular tensor cannot be built. argmax requires a well-formed tensor as input.

This function finds the index of the maximum value. It is simple, but useful for computing accuracy in single-label classification by passing the predictions to it: when the final layer outputs class probabilities, argmax picks the predicted class.
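As a small sketch of that accuracy use case (the predictions and labels below are made up):

```python
import torch

# Rows are samples, columns are class scores (e.g. softmax outputs)
preds = torch.tensor([[0.1, 0.7, 0.2],
                      [0.8, 0.1, 0.1],
                      [0.2, 0.3, 0.5]])
labels = torch.tensor([1, 0, 2])

pred_classes = torch.argmax(preds, dim=1)           # predicted class per row
accuracy = (pred_classes == labels).float().mean()  # fraction of correct rows
print(pred_classes, accuracy)
```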

`torch.where` returns a tensor of elements selected from two tensors depending on the condition provided. This function is very helpful: for example, if we have to replace all negative values in a tensor with 0, as ReLU does, we can use where.

```
# Example 1 - working
x = torch.randn(2, 6)
y = torch.zeros(2, 6)
torch.where(x > 0, x, y)
```

```
tensor([[0.0000, 2.0267, 0.1806, 0.0040, 0.0000, 0.3850],
[0.1064, 0.0000, 0.0000, 0.1939, 0.0000, 0.1403]])
```

Negative values in tensor x are replaced with 0 from tensor y.

```
# Example 2 - working
x = torch.Tensor([1., 2, 3, 4, 7])
torch.where(x == 7, torch.Tensor([0]), x)
```

```
tensor([1., 2., 3., 4., 0.])
```

We can use the where function to replace a particular value in a tensor.

```
# Example 3 - breaking (to illustrate when it breaks)
x = torch.Tensor([1., 2, 3, 4, 7])
torch.where(x == 7, 0, x)
```

```
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-11-d9404139b323> in <module>
2 x = torch.Tensor([1., 2, 3, 4, 7])
3
----> 4 torch.where(x == 7, 0, x)
TypeError: where(): argument 'input' (position 2) must be Tensor, not int
```

Position 2 must be a tensor, but we passed an int, so the call failed; pass a tensor as the argument instead.
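For example, wrapping the replacement value in a tensor makes the call work; `zeros_like` is one convenient choice:

```python
import torch

x = torch.tensor([1., 2., 3., 4., 7.])
# Pass a broadcastable tensor of zeros instead of a plain int
result = torch.where(x == 7, torch.zeros_like(x), x)
print(result)  # the 7 is replaced by 0, everything else is kept
```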

`torch.from_numpy` creates a tensor from a NumPy array, which is mostly useful for running NumPy data as a tensor on the GPU.

```
# Example 1 - working
import numpy as np
r = np.array([1,2,3,4,5,6])
a = torch.from_numpy(r)
print(a)
type(a)
```

```
tensor([1, 2, 3, 4, 5, 6])
torch.Tensor
```

Converts a NumPy array to a tensor.

```
# Example 2 - working
l = np.array([111,12,13,14,15,16])
m = torch.from_numpy(l)
print(l)
type(m)
```

```
[111 12 13 14 15 16]
torch.Tensor
```

This function is needed to convert a NumPy ndarray to a tensor.

```
# Example 3 - breaking (to illustrate when it breaks)
g = [7,7,7]
h = torch.from_numpy(g)
```

```
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-22-f48bcc71eb26> in <module>
1 # Example 3 - breaking (to illustrate when it breaks)
2 g = [7,7,7]
----> 3 h = torch.from_numpy(g)
TypeError: expected np.ndarray (got list)
```

Given anything other than an ndarray, this function throws an error.

This function converts a NumPy ndarray to a tensor. It is especially useful because we can do operations with NumPy on the CPU and, when needed, convert to a tensor and run on the GPU.
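One detail worth knowing (and easy to verify): `from_numpy` does not copy the data; the tensor and the array share the same memory:

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])
t = torch.from_numpy(arr)   # no copy: tensor and array share the same buffer
arr[0] = 99                 # mutating the array is visible through the tensor
print(t)

# .numpy() goes the other way, also without a copy (for CPU tensors)
back = t.numpy()
print(back)
```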

This is an obviously useful and much-needed function in neural nets: it returns the matrix product of tensors.

```
# Example 1 - working
x = torch.randn(2, 2)
y = torch.randn(2, 5)
torch.matmul(x, y)
```

```
tensor([[-0.5585, -0.4268, -0.1464, 0.2729, -0.4232],
[-0.1942, 2.7557, 1.4192, -0.5796, 0.8051]])
```

Matrix multiplication of a 2×2 and a 2×5 tensor gives a 2×5 result.

```
# Example 2 - working
x = torch.randn(3, 4, 4)
y = torch.randn(3,4,4)
torch.matmul(x,y)
```

```
tensor([[[ 3.2719, 0.6905, 0.2168, -1.3025],
[ 0.8261, -0.0656, -0.1460, -1.4672],
[-0.3187, -2.0030, -2.8979, 2.2226],
[-2.9364, -0.5917, -1.0406, -0.4551]],
[[ 0.7714, 0.6768, -3.8320, 0.4671],
[-0.1203, -3.0586, 2.9324, -2.5221],
[-0.6227, 0.0743, 0.9032, 0.0845],
[-1.5827, 2.8724, 8.0832, 1.3175]],
[[-1.2030, 1.9033, 0.7302, -1.7191],
[-2.9976, -4.6867, -2.1513, -1.4554],
[-0.4289, 1.3620, 0.5602, 0.9934],
[-1.7558, -2.1634, 0.2784, 0.0987]]])
```

This example is shown because in practice we have tensors with channel × height × width dimensions, and multiplying batches of such matrices is useful.

```
# Example 3 - breaking (to illustrate when it breaks)
y = torch.randn(3, 1)
torch.matmul(x, y)
```

```
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-32-75ff4103ae09> in <module>
1 # Example 3 - breaking (to illustrate when it breaks)
2 y = torch.randn(3, 1)
----> 3 torch.matmul(x, y)
RuntimeError: size mismatch, m1: [12 x 4], m2: [3 x 1] at /opt/conda/conda-bld/pytorch_1587428266983/work/aten/src/TH/generic/THTensorMath.cpp:41
```

The number of columns of matrix 1 must match the number of rows of matrix 2, or matmul throws an error.

Matrix multiplication is the main operation in neural nets, so this function is definitely on my list.


credits: PyTorch official docs, Jovian.ml


- Official documentation for `torch.Tensor`: https://pytorch.org/docs/stable/tensors.html

July 11, 2020

This is part two of my series Make Code Simple with the DataBlock API. If you have not read part one, see here. In it, we saw what a DataBlock as a whole means and what bricks are put together to make one. In the last blog I stopped short of showing code, so in this blog we'll dive into the code.

Welcome!!!

Let us dive directly into code

```
!pip install -q fastai2
```

Depending on the type of problem, the blocks in the DataBlock change while the rest stays the same. Let's see an example first.

```
from fastai2.vision.all import *
```

```
camvid = DataBlock(blocks=(ImageBlock, MaskBlock(codes=np.loadtxt(path/'codes.txt', dtype=str))),
                   get_items=get_image_files,
                   splitter=RandomSplitter(),
                   get_y=lambda o: path/'labels'/f'{o.stem}_P{o.suffix}',
                   item_tfms=Resize(224),
                   batch_tfms=aug_transforms())

dls = camvid.dataloaders(path/"images")
dls.show_batch()
```

What did we do differently from the example in the part 1 blog? Just the `MaskBlock` in blocks. That is how simple creating a DataBlock is.

Depending on the task, the blocks=() tuple changes. Blocks are used to nest transforms inside a pre-defined problem domain.

So there are different types of blocks.

- ImageBlock is used if the dataset is of images
- CategoryBlock is for single-label categorical targets
- MultiCategoryBlock is for multi-label categorical targets
- RegressionBlock is for float targets
- MaskBlock for segmentation masks, potentially with codes
- PointBlock is for points in an image
- BBoxBlock is for bounding boxes in an image
- BBoxLblBlock is for labeled bounding boxes, potentially with vocab

So depending on type of domain these blocks can be used.

Coming back to our example: the MaskBlock is created with codes that give the correspondence between the pixel values of the masks and the objects they represent.

Now let's take a multi-label classification problem.

```
df = pd.read_csv(path2/'train.csv')
df.head()
```

```
pascal = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                   splitter=ColSplitter('is_valid'),
                   get_x=ColReader('fname', pref=str(path2/'train') + os.path.sep),
                   get_y=ColReader('labels', label_delim=' '),
                   item_tfms=[FlipItem(p=0.5), Resize(224, method='pad')],
                   batch_tfms=[*aug_transforms(do_flip=True, flip_vert=True, max_rotate=180.0,
                                               max_lighting=0.6, max_warp=0.1, p_affine=0.75,
                                               p_lighting=0.75,
                                               xtra_tfms=[RandomErasing(p=1.0, sh=0.1,
                                                                        min_aspect=0.2, max_count=2)]),
                               Normalize])

dls2 = pascal.dataloaders(df)
dls2.show_batch()
```

The basic principles remain the same; the blocks used depend on the domain.

Now look at the splitters. In example 1 we used RandomSplitter because we had no rule for how to split the data. That is not the case in example 2: the df has a column called is_valid that tells us how to split, so we used ColSplitter('is_valid').

So I assume you now understand how the splitter works?

**Think of example where you had column of folds and depending on that you need to split for k fold cross validation, then how would you split the dataset**???

This is shown with code in my Kaggle kernel

get_x and get_y are easy and in my above kernel it is also explained,

Now I’ll move to item_tfms and batch_tfms

Observe item_tfms and batch_tfms in example2

I should not have applied so many transforms; flip_vert, for example, makes no sense in this case, but it is applied to show that there are a lot of transforms we can use.

Even if you don't write any, fastai applies a few default transforms, and that is the beauty of fastai.

```
item_tfms = [FlipItem(p=0.5), Resize(224, method='pad')]
```

What does this mean? As we saw in the part 1 blog, item transforms are applied on the CPU, so they run at normal speed and we don't apply many transforms here, only the basics: a flip with probability 0.5 is applied, then the images are resized to 224×224 using padding.

```
batch_tfms=[*aug_transforms(do_flip=True, flip_vert=True, max_rotate=180.0, max_lighting=0.6,
                            max_warp=0.1, p_affine=0.75, p_lighting=0.75,
                            xtra_tfms=[RandomErasing(p=1.0, sh=0.1, min_aspect=0.2, max_count=2)]),
            Normalize]
```

Batch transforms are applied on the GPU, so they are faster. I have used many here just to show how they work.

In the show_batch of example 2 you see a lot of erased boxes; that is RandomErasing at work. You can vary its probability, and this can improve accuracy. The CutMix callback uses a similar but more elaborate technique.

Normalize is used without ImageNet stats here, so normalization is done using the mean and standard deviation of each batch.

`aug_transforms` is a utility function to easily create a list of flip, rotate, zoom, warp, and lighting transforms.

Random flip with p=0.5 is added when do_flip=True. With p_affine we apply a random rotation of max_rotate degrees, a random zoom between min_zoom and max_zoom and a perspective warping of max_warp. With p_lighting we apply a change in brightness and contrast of max_lighting. Custom xtra_tfms can be added.

So this is it:)

I assume you have understood introductory knowledge about datablock.

It is actually easy but needs practice; once you get used to it, you can create DataLoaders using the DataBlock API very quickly.

You can practice in google colab here

Credit: fastai

Thank you for giving your time:)

*Originally published at https://kirankamath.netlify.app.*

July 03, 2020

We have all used Object-Oriented Programming knowingly or otherwise. If you are gathering skills to be a data scientist then OOP is also an important topic to learn.

Have you ever wondered how the famous packages we use, like scikit-learn, work (how they are built)?

What does it mean when we use

```
import pandas as pd
pd.read_csv()
```

What is pandas, where did this read_csv() come from, and could we use it without writing such a function ourselves?

Then this blog is for you.

Let’s get started. Welcome:)

We actually use concepts of OOPs

so what is OOP??

Object-oriented programming is a style of writing programs using classes and objects.

object-oriented programming allows you to create large, modular programs that can easily expand over time.

object-oriented programs hide the implementation from the end-user. When you train a machine learning algorithm with Scikit-learn, you don’t have to know anything about how the algorithms work or how they were coded. You can focus directly on the modeling. If the implementation changes, you as a user of the package might not ever find out.

A Python package does not need to use object-oriented programming. You could simply have a Python module with a set of functions. However, most if not all of the popular Python packages take advantage of object-oriented programming because:

- Object-oriented programs are relatively easy to expand especially because of inheritance
- Object-oriented programs obscure functionality from the user.

Objects are defined by attributes and methods.

Think of objects as things that exist in the real world.\ For example, if we take a restaurant:

- Restaurant is itself object
- food dishes is object
- waiter is also object

So if the waiter is an object, what are its attributes (characteristics) and methods (actions)?

Attributes of waiter are name, address, phone number, salary

Methods of waiter are taking order, hike in salary, and serving dishes.

So now we know about objects. Two waiter objects may hold different values: there are two objects, but the attributes and methods are the same. Both have common attribute types with different values, which means the two objects share a common blueprint.

So the blueprint of an object is called class. And using this blueprint we can create many objects.

So let’s see the code of the class.

```
class Waiter:
    """The Waiter class represents a type of person who takes orders
    and serves dishes in a restaurant.
    """
    def __init__(self, name, address, height, salary):
        """Method for initializing a Waiter object
        Args:
            name (str)
            address (str)
            height (int)
            salary (float)
        Attributes:
            name (str): name of the waiter object
            address (str): address of the waiter object
            height (int): height of the waiter object
            salary (float): salary of the waiter object
        """
        self.name = name
        self.address = address
        self.height = height
        self.salary = salary

    def hike_salary(self, hike_percent):
        """The hike_salary method changes the salary attribute of a waiter object
        Args:
            hike_percent (float): the fractional raise to apply to the salary
        Returns: None
        """
        sal = self.salary + (hike_percent * self.salary)
        self.salary = sal
```

what is self?? I have used self many times in code.

It is used to pass values to attributes and differentiates between these two objects.

self tells Python where to look in the computer's memory for that particular object, and Python then changes the values of that object. When you call a method, self is passed in implicitly.
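A minimal sketch (keeping only the salary logic from the class above) shows how self keeps two objects separate:

```python
class Waiter:
    """Trimmed-down Waiter: only the attributes needed for the salary logic."""
    def __init__(self, name, salary):
        self.name = name      # self makes these attributes per-instance
        self.salary = salary

    def hike_salary(self, hike_percent):
        self.salary = self.salary + hike_percent * self.salary

a = Waiter("Asha", 1000.0)    # illustrative names and salaries
b = Waiter("Ravi", 2000.0)
a.hike_salary(0.10)           # equivalent to Waiter.hike_salary(a, 0.10)
print(a.salary, b.salary)     # only a's salary changed
```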

Did you notice methods that look similar to functions?? A function and a method look very similar. They both use the def keyword. They also have inputs and return outputs. The difference is that a method is inside of a class whereas a function is outside of a class.

This code tells about OOPs and also the use of docstrings.

Dunder or magic methods in Python are methods having two prefix and suffix underscores in the method name; "dunder" means "Double Under (Underscores)". They are commonly used for operator overloading. Wait!! Did you see one in the above code? Yes: `__init__` is a dunder method; it overrides the default initialization behavior.

Now let’s look into Gaussian distribution class and understand how to use dunder.

```
import math

class Gaussian():
    """ Gaussian distribution class for calculating and
    visualizing a Gaussian distribution.
    """
    def __init__(self, mu=0, sigma=1):
        self.mean = mu
        self.stdev = sigma
        self.data = []

    def calculate_mean(self):
        self.mean = 1.0 * sum(self.data) / len(self.data)
        return self.mean

    def calculate_stdev(self, sample=True):
        if sample:
            n = len(self.data) - 1
        else:
            n = len(self.data)
        mean = self.mean
        sigma = 0
        for d in self.data:
            sigma += (d - mean) ** 2
        sigma = math.sqrt(sigma / n)
        self.stdev = sigma
        return self.stdev
```

So how do we add two Gaussian distributions? The mathematical explanation seems easy, but how do we do it in code? If you try the line below, you get an error.

```
gaus_a + gaus_b  # TypeError: unsupported operand type(s) for +: 'Gaussian' and 'Gaussian'
```

Now comes dunder methods.

There is a dunder method called `__add__` on a Python class which helps add two instances of a custom object. This means we can control the result of a sum of two objects by defining or modifying the `__add__` method.

If you add this code inside the above Gaussian class then some magic happens.

```
def __add__(self, other):
    result = Gaussian()
    result.mean = self.mean + other.mean
    result.stdev = math.sqrt(self.stdev**2 + other.stdev**2)
    return result
```

Now the code which gave you error before would work fine :)
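Putting the pieces together, here is a trimmed-down, self-contained sketch of the Gaussian class with `__add__` defined (only the attributes needed for the sum are kept; the example values are illustrative):

```python
import math

class Gaussian:
    """Minimal Gaussian: just mean, stdev, and the __add__ dunder."""
    def __init__(self, mu=0, sigma=1):
        self.mean = mu
        self.stdev = sigma

    def __add__(self, other):
        result = Gaussian()
        result.mean = self.mean + other.mean
        # Variances add, so the new stdev is the root of the summed squares
        result.stdev = math.sqrt(self.stdev ** 2 + other.stdev ** 2)
        return result

gaus_a = Gaussian(10, 3)
gaus_b = Gaussian(20, 4)
gaus_c = gaus_a + gaus_b      # no TypeError: Python calls Gaussian.__add__
print(gaus_c.mean, gaus_c.stdev)
```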

This way we can override the default behavior. Isn't that useful?

In the restaurant example, we saw that a food dish is an object, which suggests every food dish would get its own class. But why take the trouble to build separate classes for everything? Could there be a better way?

so the concept of inheritance helps here.

We could have a general class called food dish with all the attributes common to all food items, and inherit from that class for the different food dishes. Now, if you want to add an attribute called seasonal to all food dishes, instead of adding it to each one, we add it to the root class and all the other classes inherit it. This saves a lot of time and effort.

To inherit from a class, we write a general parent class and put its name inside the parentheses of the child class.

```
class Gaussian(Distribution):
```

Distribution is the parent class, and Gaussian (the child) inherits from it.
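A minimal, self-contained sketch of what that inheritance might look like (the exact Distribution class here is illustrative; its attribute names mirror the Gaussian example above):

```python
class Distribution:
    """Hypothetical parent class: attributes shared by all distributions."""
    def __init__(self, mu=0, sigma=1):
        self.mean = mu
        self.stdev = sigma
        self.data = []

class Gaussian(Distribution):
    """Child class: inherits mean, stdev, and data from Distribution."""
    def __init__(self, mu=0, sigma=1):
        Distribution.__init__(self, mu, sigma)  # reuse the parent initializer

g = Gaussian(5, 2)
print(g.mean, g.stdev, g.data)       # attributes came from the parent
print(isinstance(g, Distribution))   # a Gaussian is also a Distribution
```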

In this blog, we have seen all basics of OOP. Remember at the start of blog I said famous packages will use OOP concepts to build packages, so learning of OOP is incomplete unless you apply your knowledge on how to use OOP.

So in the next blog, we will see how to create a package with python using OOP concepts and upload it to PyPI, after which you can use the package you created using pip install.

**Credit:** Udacity course.

Thank you for reading this blog:)

*Originally published at https://kirankamath.netlify.app.*

June 14, 2020

Welcome!!!

This blog is written to introduce you to fastai's awesome DataBlock API. This is the first part of the blog; part 2 will take a code-first approach.

Even though fastai follows a top-down approach, I am writing this first part with no code: a theoretical explanation that sets the motive for learning the DataBlock with code in part 2. (In the course, Jeremy gives the motivation and an awesome explanation of why to use it before showing code, so I find that important to do before writing code.)

So lets start!!!

If you have used any deep learning framework (I use PyTorch, so I speak with respect to it) to build a model, you go through the steps of collecting the data, identifying the type of problem (image classification, segmentation, and so on), seeing what the dependent and independent variables are, splitting the data into training and validation sets, and applying transforms to improve accuracy.

And in that process you may have written lengthy code for all these tasks. What if I told you that you can do it all in one single block? (You can also do it the normal way and refactor, but the DataBlock approach looks cleaner to me, and I make fewer errors following it.)

So What is **Data Block** api???

Data block api is high level api in fastai. The data block API is an expressive API for data loading. It is way to systematically define all of the steps necessary to prepare data for a deep learning model, and give users a mix and match recipe book for combining these pieces (which we refer to as data blocks)

Think of the DataBlock as a list of instructions to do when we’re building batches and our DataLoaders. It doesn’t need any items explicitly to be done, and instead is a blueprint of how to operate. Writing a DataBlock is just like writing a blueprint.

We just now saw a word DataLoaders. Let us see about that. PyTorch and fastai have two main classes for representing and accessing a training set or validation set:

`Dataset`

:: A collection that returns a tuple of your independent and dependent variable for a single item

`DataLoader`

:: An iterator that provides a stream of mini-batches, where each mini-batch is a couple of a batch of independent variables and a batch of dependent variables

Interestingly, fastai provides two classes for bringing your training and validation sets together:

`Datasets`

:: An object that contains a training Dataset and a validation Dataset

`DataLoaders`

:: An object that contains a training DataLoader and a validation DataLoader.

The fastai library has an easy way of building DataLoaders: simple enough for someone with minimal coding knowledge to get it, yet advanced enough to allow exploration.

There are steps for creating a DataBlock; let's see them.

The data block API defines these steps as questions that can be asked while looking at the data:

- What are the types of your inputs/targets? (`blocks`)
- Where is your data? (`get_items`)
- Does something need to be applied to the inputs? (`get_x`)
- Does something need to be applied to the target? (`get_y`)
- How do you split the data? (`splitter`)
- Do we need to apply something to the formed items? (`item_tfms`)
- Do we need to apply something to the formed batches? (`batch_tfms`)

This is it!!

As you answer these questions, you write a DataBlock.

You can treat each question or step as a brick that builds the fastai DataBlock.

- Blocks
- get_items
- get_x/get_y
- splitter
- item_tfms
- batch_tfms

Looking at the dataset is very important while building DataLoaders, and using the DataBlock API is a strategy for approaching the problem. The first thing to look at is how the data is stored: in which format and manner, whether it matches the layout of a well-known dataset, and how to approach it.

**Blocks** define the pre-defined problem domain. For example, if it's an image problem, I can tell the library to use Pillow without saying so explicitly, and whether it is single-label or multi-label classification. There are many blocks: ImageBlock, CategoryBlock, MultiCategoryBlock, MaskBlock, PointBlock, BBoxBlock, BBoxLblBlock, TextBlock, and so on. (I will explain the code-related details in part 2 of the blog.)

**get_items** answers: where is the data?

For example, in an image problem we can use the `get_image_files` function to grab all the file locations of our images and look at the data. (I will explain the code-related details in part 2 of the blog.)

**get_x** answers: does something need to be applied to the inputs?

**get_y** is how you extract the labels.

**splitter** is how you want to split your data. Usually this is a random split between the training and validation datasets.

The remaining two bricks of the DataBlock API are item_tfms and batch_tfms, which handle augmentation.

**item_tfms** are item transforms, applied to each item individually. This is done on the CPU.

**batch_tfms** are batch transforms, applied to batches of data. This is done on the GPU.

Using these bricks in a DataBlock, we can build DataLoaders ready for different types of problems: classification, object detection, segmentation, and more.

The data blocks API provides a good balance of conciseness and expressiveness. In the data science domain, the scikit-learn pipeline approach is widely used. That API provides a very high level of expressivity, but it is not opinionated enough to ensure that a user completes all of the steps necessary to get their data ready for modelling; the fastai data block API does ensure this.

Now that we have seen what the DataBlock API is, let's wrap everything up and build one.

It's time!! Let's see the code (only the DataBlock) for single-label classification of the Oxford-IIIT Pets dataset.

```
pets = DataBlock(blocks=(ImageBlock, CategoryBlock),
                 get_items=get_image_files,
                 splitter=RandomSplitter(),
                 get_y=Pipeline([attrgetter("name"), RegexLabeller(pat=r'^(.*)_\d+.jpg$')]),
                 item_tfms=Resize(128),
                 batch_tfms=aug_transforms())
```

What is this Code???

Reminder: This is introduction part of blog,

Curious to know what is in the code, and how to write the code, then read part 2 of blog which will be published on Sunday, 21 June, 2020 10:30 am IST here.

Credits:

- fastai docs
- Thanks to Zach Mueller, blog on datablock api, please keep writing blog and making videos.
- fastai A layered api for deep learning paper

*Originally published at https://kirankamath.netlify.app.*

June 05, 2020

Photo by Markus Spiske on Unsplash

Inspiration for writing this blog: fastai course( Jeremy says regular expression is an important tool to consider learning. After completing the first part of course, I felt like writing a blog on this, but forgot. I should have written this blog earlier, but remembered about this topic when I was going through fastai v2)

check fastai v2

Regular expression is a sequence of characters mainly used to find and replace patterns in a string or file.

Lets discuss problem that can be solved using regular expression. (example is from fastai course)

While solving deep learning problems, we have datasets where the label is sometimes stored in the file name, so we have a path and need to extract the label from it. Or you may need to extract information from a website. In these and similar situations, regular expressions are an important tool.

Lets get started with easy example first.

Suppose you have a document and you want to search for the names of all people with the first name 'Kiran' (the last name can be anything). How do you do it? This is where regular expressions come into play.

regular expression: ‘ **Kiran\s\w+\s’**

Here \s matches a space and \w matches a word character; + means one or more. This extracts all names with the first name Kiran, along with the last name.
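Here is a minimal sketch with a made-up document; the trailing \s is dropped so that a name at the end of a sentence also matches:

```python
import re

# Hypothetical document text for illustration
text = "Invitees: Kiran Kamath, Anil Kumar, Kiran Rao, Sita Devi."
# 'Kiran', a whitespace character, then one or more word characters (the last name)
matches = re.findall(r"Kiran\s\w+", text)
print(matches)  # every 'Kiran <last name>' in the text
```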

Let's see an example where the label is in the file name path:

`data/oxford-iiit-pet/images/american_bulldog_146.jpg`

`data/oxford-iiit-pet/images/german_shorthaired_137.jpg`

american_bulldog is label of that image. But how to extract it???

Writing a regular expression mirrors the way we approach the problem. Looking at the example above, we can see that the label comes after the last forward slash (/), is followed by a number, and the path ends with the `.jpg` format.

Regular expression is **/([^/]+)_\d+.jpg$**

I’ll explain step by step.

**$** means end of text we are interpreting

**.jpg** makes sure that just before the end of the text we have jpg, i.e. the right format.

**\d** means a numeric digit, and + means one or more digits.

**_** is the underscore appearing before the numbers.

**([^/]+)** looks for a group of characters that does not contain a forward slash; [ ] delimits the characters we are interested in, and '**^**' inside the brackets means negation.

The **forward slash** at the beginning tells our search to stop when we hit a forward slash.

**/([^/]+)_\d+.jpg$** gives us the label we want, i.e. `american_bulldog` in our example.

Python code:

```
import re

string = 'data/oxford-iiit-pet/images/american_bulldog_146.jpg'
pat = r'/([^/]+)_\d+.jpg$'
pat = re.compile(pat)
print(pat.search(string).group(1))
# american_bulldog
```

Important Regular expression cheat sheet:

```
^      Start of string
$      End of string
\b     Word boundary
*      0 or more
+      1 or more
?      0 or 1
\s     White space
\S     Not white space
\d     Digit
\D     Not digit
\w     Word character
\W     Not word character
\      Escape following character
\n     New line
\t     Tab
.      Any character except new line (\n)
[a-z]  Lower case letter from a to z
[A-Z]  Upper case letter from A to Z
(a|b)  a or b
[abc]  Range (a or b or c)
[^abc] Not (a or b or c)
[0-7]  Digit from 0 to 7
```
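A few of these tokens in action (a quick sanity-check sketch with made-up strings):

```python
import re

assert re.search(r"^\d+$", "2020")             # only digits, start to end
assert re.search(r"\bcat\b", "the cat sat")    # 'cat' as a whole word
assert not re.search(r"\bcat\b", "concatenate")  # no word boundary around 'cat'
assert re.findall(r"[abc]+", "aabxcc") == ["aab", "cc"]  # runs of a, b, or c
print("all patterns behaved as expected")
```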

I have explained regular expressions with just two examples, but the purpose was to introduce you to what they can do. If you learn how to use them, regular expressions can be an important tool in your data science toolbox.

Thank you for reading the blog.

*Originally published at https://kirankamath.netlify.app.*

May 30, 2020

We looked into the Jacobian matrix, element-wise operations, derivatives involving single expressions, vector sum reduction, and the chain rule in previous blogs. Please go through them first: Blog 1, Blog 2.

Let us compute the derivative of a typical neuron activation for a single neural network computation unit with respect to the model parameters, *w* and b

*activation*( **x**) = max(0, **w** . **x** + b)

This represents a neuron with fully connected weights and a ReLU. Let us compute the derivative of ( **w** . **x** + b ) with respect to **w** and b. The dot product **w . x** is the sum of the element-wise multiplication of the elements, so the partial derivative of sum( **w** ⊗ **x**) can be calculated using the chain rule with an intermediate vector variable.

The max(0, z) function call on a scalar z just says to treat all negative z values as 0. The derivative of the max function is a piecewise function: when z ≤ 0, the derivative is 0 because z is clipped to a constant; when z > 0, the derivative of the max function is just the derivative of z, which is 1.

When the activation function clips affine function output z to 0, the derivative is zero with respect to any weight w i . When z > 0, it’s as if the max function disappears and we get just the derivative of z with respect to the weights.
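This piecewise behaviour can be checked numerically; the sketch below (pure Python, illustrative sample points) approximates the derivative with a central finite difference:

```python
def relu(z):
    return max(0.0, z)

def numeric_deriv(f, z, h=1e-6):
    # central finite-difference approximation of df/dz
    return (f(z + h) - f(z - h)) / (2 * h)

# For z < 0 the derivative is 0; for z > 0 it is 1 (at z = 0 it is undefined)
print(round(numeric_deriv(relu, -2.0), 6))
print(round(numeric_deriv(relu, 3.0), 6))
```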

Training a neuron requires that we take the derivative of our loss or “cost” function with respect to the parameters of our model, **w** and b

We need to calculate the gradient with respect to the weights and the bias.

Let X = [x_1, x_2, …, x_N]^T (T means transpose).

If the error is 0, then the gradient is zero and we have arrived at the minimum loss. If e_i is some small positive difference, the gradient is a small step in the direction of x_i; if e_i is large, the gradient is a large step in that direction. We want to reduce, not increase, the loss, which is why the gradient descent recurrence relation takes the negative of the gradient to update the current position.

Look at things like the shape of a vector (long or tall), whether a variable is a scalar or a vector, and the dimensions of a matrix. Vectors are represented by bold letters. After reading this blog, please read the paper to gain more understanding.

The paper has a unique way of explaining concepts, moving from simple to complex. By the end of the paper we can work through difficult expressions ourselves, because they decompose into simple expressions we already understand deeply.

First, we start with the functions of simple parameters represented by f(x). Second, we move to the functions of the form f(x,y,z). To calculate the derivatives of such functions, we use partial derivatives which are calculated with respect to specific parameters. Thirdly we move to the scalar function of a vector of input parameters as f( **x**), wherein the partial derivatives of f( **x**) are represented as vectors. Lastly, we see **f**( **x**) to represent a set of scalar functions of the form f( **x**).

This is the last part of the blog, part3.

Thank you.

*Originally published at https://kirankamath.netlify.app.*

The paper is beginner-friendly, but I wanted to write this blog to note down points that make the paper easier to understand. When we learn a slightly difficult topic, we should be able to explain it, the way we learnt it, to a beginner who may know nothing about the field, so this blog is for beginners.

Deep learning is all about linear algebra and calculus. If you try to read any deep learning paper, matrix calculus is needed to understand the concepts. Maybe *need* is not the right word, since Jeremy's courses show how to become a world-class deep learning practitioner with only a minimal level of calculus;
check fast.ai for courses.

I have written my understanding of the paper in the form of three blogs. This is part 1; check this website for the other two parts. Deep learning is basically the use of neurons arranged in many layers. What does each neuron do?

Each neuron applies a function on input and gives an output. The activation of a single computation unit in a neural network is typically calculated using the dot product of an edge weight vector **w** with an input vector **x** plus a scalar bias (threshold):

z(x) = **w** · **x** + b
Letters written in bold are vectors; **w** is a vector.
Function *z(x)* is called the unit's affine function and is followed by a rectified linear unit, which clips negative values to zero: max(0, z(x)). This computation takes place in neurons. Neural networks consist of many of these units, organized into multiple collections of neurons called layers. The activation of one layer's units becomes the input to the next layer's units. Math becomes simple when inputs, weights, and functions are treated as vectors, and the flow of values can be treated as matrix operations.
The most important math used here is *differentiation*: calculating rates of change and optimizing the loss function to decrease error is the main purpose. The training phase is all about choosing the weights **w** and bias *b* so that we get the desired output for all N inputs **x**. To do that, we minimize a loss function, and to minimize the loss, we use SGD. Measuring how the output changes with respect to a change in a weight is the same as calculating the (partial) derivative of the output with respect to that weight **w**. All of this requires the partial derivatives (the gradient) of activation(x) with respect to the model parameters **w** and *b*. Our goal is to gradually tweak **w** and *b* so that the overall loss function keeps getting smaller across all inputs **x**.

Basic rules needed to solve problems

Neural networks are functions of multiple parameters, so let's discuss that.
What is the derivative of xy (x multiplied by y)?
Well, it depends on whether we are differentiating with respect to x or y. We compute derivatives with respect to one variable at a time, giving two derivatives in this case, which we call partial derivatives. The symbol ∂ is used instead of *d* to denote them.
The partial derivative with respect to *x* is just the usual scalar derivative, simply treating every other variable in the equation as a constant.
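A quick sketch with sympy (using the paper's running example f(x, y) = 3x²y) confirms this:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = 3 * x**2 * y

# partial with respect to x: treat y as a constant
df_dx = sp.diff(f, x)
# partial with respect to y: treat x as a constant
df_dy = sp.diff(f, y)
print(df_dx, '|', df_dy)   # 6*x*y | 3*x**2
```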

We will see how to calculate the gradient of f(x,y).

The gradient of f(x,y) is simply a vector of its partials. Gradient vectors organize all of the partial derivatives for a specific scalar function. If we have two functions, we can also organize their gradients into a matrix by stacking them. When we do so, we get the Jacobian matrix, where the gradients are rows.
To define the Jacobian matrix more generally, let's combine multiple parameters into a single vector argument: f (x, y, z) ⇒ f (**x**)
Let **y** = **f**( **x**) be a vector of m scalar-valued functions that each take a vector **x** of length n = | **x**|, where | **x**| is the cardinality (count) of elements in **x**. Each fᵢ within **f** returns a scalar. For example, we can write f(x, y) = 3x²y and g(x, y) = 2x + y⁸ from the last section as
y₁ = f₁( **x**) = 3x₁²x₂
y₂ = f₂( **x**) = 2x₁ + x₂⁸
The Jacobian matrix is the collection of all m × n possible partial derivatives (m rows and n columns), which is the stack of the m gradients with respect to **x**.
The Jacobian of the identity function **f**( **x**) = **x**, with fᵢ( **x**) = xᵢ, has n functions, and each function has n parameters held in a single vector **x**. The Jacobian is, therefore, a square matrix since m = n.
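To make this concrete, here is a sketch (using sympy) that computes the 2×2 Jacobian for y₁ = 3x₁²x₂ and y₂ = 2x₁ + x₂⁸, plus the Jacobian of the identity function:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

# stack the two scalar functions into a vector function f(x)
f = sp.Matrix([3 * x1**2 * x2,     # y1 = f1(x)
               2 * x1 + x2**8])    # y2 = f2(x)

J = f.jacobian([x1, x2])           # one row per function, one column per parameter
print(J)                           # Matrix([[6*x1*x2, 3*x1**2], [2, 8*x2**7]])

# the Jacobian of the identity function f(x) = x is the square identity matrix
print(sp.Matrix([x1, x2]).jacobian([x1, x2]))   # Matrix([[1, 0], [0, 1]])
```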

Element-wise operations are important to know in deep learning. By element-wise binary operations we simply mean applying an operator to the first item of each vector to get the first item of the output, then to the second items of the inputs for the second item of the output, and so forth. We can generalize element-wise binary operations with the notation **y** = **f**( **w**) ○ **g**( **x**), where m = n = | **y**| = | **w**| = | **x**|.

When we multiply or add scalars to vectors, we're implicitly expanding the scalar to a vector and then performing an element-wise binary operation. For example

(The notation 1⃗ represents a vector of ones of appropriate length.) z is any scalar that doesn't depend on **x**, which is useful because then ∂z/∂xᵢ = 0 for any xᵢ, and that will simplify our partial derivative computations.

Summing up the elements of a vector is an important operation in deep learning, appearing for example in the network loss function. Let y = sum( **f**( **x**)). Notice we were careful here to leave the parameter as a vector **x**, because each function fᵢ could use all values in the vector, not just xᵢ. The sum is over the results of the function, not over the parameter.

The gradient of the simple y = sum( **x**) is a vector of ones: ∇y = **1**ᵀ = [1, 1, …, 1], because ∂xᵢ/∂xⱼ = 0 for j ≠ i. The transpose appears because we have assumed vectors are vertical by default. It's very important to keep the shapes of all of your vectors and matrices in order; otherwise it's impossible to compute the derivatives of complex functions.
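A quick sympy sketch confirming that every partial of sum( **x**) is 1:

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
y = x1 + x2 + x3                          # y = sum(x) for a 3-element vector

# each partial ∂y/∂xi is 1, since ∂xi/∂xj = 0 for j != i
grad = [sp.diff(y, xi) for xi in (x1, x2, x3)]
print(grad)   # [1, 1, 1]
```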


Links that helped me: the paper The Matrix Calculus You Need For Deep Learning by Terence Parr and Jeremy Howard, and the blog by Nikhil B.
This is **part 1** of the blog. In part 2, I will explain the chain rule.

May 29, 2020

We can’t compute partial derivatives of very complicated functions using just the basic matrix calculus rules we’ve seen in blog part 1. For example, we can’t take the derivative of nested expressions like sum( **w** + **x**) directly without reducing it to its scalar equivalent. We need to be able to combine our basic vector rules using the vector chain rule.

In the paper, they define and name three different chain rules.

- single-variable chain rule
- single-variable total-derivative chain rule
- vector chain rule

The chain rule comes into play when we need the derivative of an expression composed of nested subexpressions. The chain rule helps in solving such problems by breaking complicated expressions into subexpressions whose derivatives are easy to compute.

Chain rules are defined in terms of nested functions, such as *y = f(g(x))* for the single-variable chain rule.

Formula is

dy/dx = (dy/du) (du/dx), where u = g(x)

There are 4 steps to solve a problem using the single-variable chain rule:

- Introduce intermediate variables.
- Compute the derivatives of the intermediate variables wrt (with respect to) their parameters.
- Combine all the derivatives by multiplying them together.
- Substitute the intermediate variables back into the derivative equation.

Let's see an example with the nested equation y = f(x) = *ln*(sin(x³)²).

The key is to compute the derivatives of the intermediate variables in isolation!
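As a sympy sketch, we can verify the result of applying the four steps to this function:

```python
import sympy as sp

x = sp.Symbol('x')

# intermediate variables: u1 = x**3, u2 = sin(u1), y = ln(u2**2)
y = sp.log(sp.sin(x**3)**2)

# sympy chains the derivatives of the intermediates for us
dy_dx = sp.simplify(sp.diff(y, x))
print(dy_dx)   # mathematically equal to 6*x**2*cos(x**3)/sin(x**3)
```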

But the single-variable chain rule is applicable only when a single variable can influence the output in only one way. As we saw in the example, we can handle a nested expression of a single variable *x* using this chain rule only when x can affect y through a single data-flow path.

If we apply the single-variable chain rule to **y = f (x) = x + x²**, we get the wrong answer, because that derivative operator does not apply to multivariate functions. A change in x affects y both as an operand of the addition and as an operand of the square, so we clearly can't apply the single-variable chain rule. So…

we move to total derivatives.

To compute dy/dx, we need to sum up all possible contributions from changes in x to the change in y.

Formula for the total-derivative chain rule, with intermediate variables u₁, …, uₙ that may each depend on x:

∂f(x, u₁, …, uₙ)/∂x = ∂f/∂x + Σᵢ (∂f/∂uᵢ)(∂uᵢ/∂x)

The total derivative assumes all variables are potentially co-dependent, whereas the partial derivative assumes all variables but *x* are constants.

When you take the total derivative with respect to x, the other variables might also be functions of *x*, so we add in their contributions as well. The left side of the equation looks like a typical partial derivative, but the right-hand side is actually the total derivative.

Let's see an example.

The total-derivative formula always *sums* terms in the derivative. For example, given y = x × x² instead of y = x + x², the total-derivative chain rule formula still adds partial derivative terms; for more detail, see the demonstration in the paper.
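For y = x × x², introduce u = x²; then ∂y/∂x (holding u fixed) is u, ∂y/∂u is x, and du/dx is 2x, so the total derivative is u + x·2x = 3x², matching the direct derivative of x³. A sympy sketch of that check:

```python
import sympy as sp

x, u = sp.symbols('x u')
y = x * u                     # y = x * u, with intermediate u = x**2

# total-derivative chain rule: dy/dx = dy/dx (u fixed) + (dy/du)(du/dx)
dy_dx = sp.diff(y, x) + sp.diff(y, u) * sp.diff(x**2, x)
dy_dx = sp.expand(dy_dx.subs(u, x**2))
print(dy_dx)                  # 3*x**2, the same as d/dx of x**3
```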

The formula for the total derivative can be simplified further.

This chain rule that takes into consideration the total derivative degenerates to the single-variable chain rule when all intermediate variables are functions of a single variable.

Now consider the derivative of a sample vector function with respect to a scalar, **y** = **f**(x).

We introduce two intermediate variables, g₁ and g₂, one for each fᵢ, so that y looks more like **y** = **f**( **g**(x)).

If we split the terms, isolating them into a vector, we get a matrix-by-vector multiplication.

This completes the chain rule. In the next blog, part 3, we will see how to apply this to the gradient of the neuron activation and the loss function, and wrap up.

Thank you.

Useful Points:

While writing a blog in markdown, it is difficult to convert characters to superscript and subscript, so I have listed them below; you can copy-paste these into your markdown.

superscript ⁰ ¹ ² ³ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ᵃ ᵇ ᶜ ᵈ ᵉ ᶠ ᵍ ʰ ᶦ ʲ ᵏ ˡ ᵐ ⁿ ᵒ ᵖ ʳ ˢ ᵗ ᵘ ᵛ ʷ ˣ ʸ ᶻ

subscript ₀ ₁ ₂ ₃ ₄ ₅ ₆ ₇ ₈ ₉ ₐ ᵦ 𝒸 𝒹 ₑ 𝒻 𝓰 ₕ ᵢ ⱼ ₖ ₗ ₘ ₙ ₒ ₚ ᵩ ᵣ ₛ ₜ ᵤ ᵥ 𝓌 ₓ ᵧ 𝓏

*Originally published at https://kirankamath.netlify.app.*

April 28, 2020

- Censored Data
- Kaplan-Meier Estimates

- `lifelines` is an open-source library for data analysis.
- `numpy` is the fundamental package for scientific computing in Python.
- `pandas` is what we'll use to manipulate our data.
- `matplotlib` is a plotting library.

```
import lifelines
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from lifelines import KaplanMeierFitter as KM
from lifelines.statistics import logrank_test
```

```
from lifelines.datasets import load_lymphoma

def load_data():
    df = load_lymphoma()
    df.loc[:, 'Event'] = df.Censor
    df = df.drop(['Censor'], axis=1)
    return df

data = load_data()
```

First, take a look at your data.

```
data shape: (80, 3)
```

The column `Time` states how long the patient lived before they died or were censored. The column `Event` says whether a death was observed or not: `Event` is 1 if the event was observed (i.e. the patient died) and 0 if the data was censored.

Censorship here means that the observation ended without the event being observed. For example, suppose a patient can be observed in a hospital for at most 100 days. If a patient dies after only 44 days, their record has `Time = 44` and `Event = 1`. If a patient walks out after 100 days and dies 3 days later (103 days total), this death is not observed in our process and the corresponding row has `Time = 100` and `Event = 0`. If a patient survives for 25 years after being admitted, their data are still `Time = 100` and `Event = 0`.
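The three hypothetical patients described above would be encoded like this (a sketch with made-up rows):

```python
import pandas as pd

# rows for: death at day 44; censored at day 100 (died later, unobserved);
# censored at day 100 (survived for years afterwards)
records = pd.DataFrame({
    'Time':  [44, 100, 100],
    'Event': [1,  0,   0],
})
print(records)
```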

Plot a histogram of the survival times to see, in general, how long cases survived before censorship or an event.

```
data.Time.hist();
plt.xlabel("Observation time before death or censorship (days)");
plt.ylabel("Frequency (number of patients)");
```

```
def frac_censored(df):
    """
    Return the fraction of observations which were censored.

    Args:
        df (dataframe): dataframe which contains column 'Event' which is
                        1 if an event occurred (death)
                        0 if the event did not occur (censored)
    Returns:
        frac_censored (float): fraction of cases which were censored.
    """
    result = sum(df['Event'] == 0) / df.shape[0]
    return result

print(frac_censored(data))
```

```
0.325
```

See the distributions of survival times for censored and uncensored examples.

```
df_censored = data[data.Event == 0]
df_uncensored = data[data.Event == 1]
df_censored.Time.hist()
plt.title("Censored")
plt.xlabel("Time (days)")
plt.ylabel("Frequency")
plt.show()
df_uncensored.Time.hist()
plt.title("Uncensored")
plt.xlabel("Time (days)")
plt.ylabel("Frequency")
plt.show()
```

Let's estimate the survival function:

S(t) = P(T > t)

We’ll start with a naive estimator of the above survival function. To estimate this quantity, we’ll divide the number of people who we know lived past time $t$ by the number of people who were not censored before $t$.

Formally, let $i = 1, \ldots, n$ index the cases, and let $t_i$ be the time when case $i$ was censored or experienced an event. Let $e_i = 1$ if an event was observed for $i$ and 0 otherwise. Then let $X_t = \{i : t_i > t\}$, and let $M_t = \{i : e_i = 1 \text{ or } t_i > t\}$. The estimator you will compute is:

$$ \hat{S}(t) = \frac{|X_t|}{|M_t|} $$

```
def naive_estimator(t, df):
    """
    Return naive estimate for S(t), the probability
    of surviving past time t. Given by the number
    of cases who survived past time t divided by the
    number of cases who weren't censored before time t.

    Args:
        t (int): query time
        df (dataframe): survival data. Has a Time column,
                        which says how long until that case
                        experienced an event or was censored,
                        and an Event column, which is 1 if an event
                        was observed and 0 otherwise.
    Returns:
        S_t (float): estimator for survival function evaluated at t.
    """
    S_t = sum(df['Time'] > t) / sum((df['Time'] > t) | (df['Event'] == 1))
    return S_t
```

Check some test cases to see whether the output is correct; calculate the values by hand to cross-check.

```
print("Test Cases")
sample_df = pd.DataFrame(columns=["Time", "Event"])
sample_df.Time = [5, 10, 15]
sample_df.Event = [0, 1, 0]
print("Sample dataframe for testing code:")
print(sample_df)
print("\n")

print("Test Case 1: S(3)")
print("Output: {}\n".format(naive_estimator(3, sample_df)))

print("Test Case 2: S(12)")
print("Output: {}\n".format(naive_estimator(12, sample_df)))

print("Test Case 3: S(20)")
print("Output: {}\n".format(naive_estimator(20, sample_df)))

# Test case 4
sample_df = pd.DataFrame({'Time': [5, 5, 10],
                          'Event': [0, 1, 0]})
print("Test case 4: S(5)")
print(f"Output: {naive_estimator(5, sample_df)}")
```

```
Test Cases
Sample dataframe for testing code:
   Time  Event
0     5      0
1    10      1
2    15      0

Test Case 1: S(3)
Output: 1.0

Test Case 2: S(12)
Output: 0.5

Test Case 3: S(20)
Output: 0.0

Test case 4: S(5)
Output: 0.5
```

We will plot the naive estimator using the real data up to the maximum time in the dataset.

```
max_time = data.Time.max()
x = range(0, max_time + 1)
y = np.zeros(len(x))
for i, t in enumerate(x):
    y[i] = naive_estimator(t, data)

plt.plot(x, y)
plt.title("Naive Survival Estimate")
plt.xlabel("Time")
plt.ylabel("Estimated cumulative survival rate")
plt.show()
```

Next let’s compare this with the Kaplan Meier estimate.

Kaplan-Meier estimate:

$$ S(t) = \prod_{t_i \leq t} (1 - \frac{d_i}{n_i}) $$

where $t_i$ are the events observed in the dataset and $d_i$ is the number of deaths at time $t_i$ and $n_i$ is the number of people who we know have survived up to time $t_i$.

```
def HomemadeKM(df):
    """
    Return KM estimate evaluated at every distinct
    time (event or censored) recorded in the dataset.
    Event times and probabilities should begin with
    time 0 and probability 1.

    Example:
        input:
           Time  Censor
        0     5       0
        1    10       1
        2    15       0
        correct output:
        event_times: [0, 5, 10, 15]
        S: [1.0, 1.0, 0.5, 0.5]

    Args:
        df (dataframe): dataframe which has columns for Time
                        and Event, defined as usual.
    Returns:
        event_times (list of ints): array of unique event times
                                    (begins with 0).
        S (list of floats): array of survival probabilities, so that
                            S[i] = P(T > event_times[i]). This begins
                            with 1.0 (since no one dies at time 0).
    """
    # individuals are considered to have survival probability 1 at time 0
    event_times = [0]
    p = 1.0
    S = [p]

    # get sorted collection of unique observed event times
    observed_event_times = sorted(df.Time.unique())

    # iterate through event times
    for t in observed_event_times:
        # compute n_t, number of people who survive to time t
        n_t = sum(df['Time'] >= t)
        # compute d_t, number of people who die at time t
        d_t = sum((df['Event'] == 1) & (df['Time'] == t))
        # update p
        p = ((n_t - d_t) / n_t) * p
        # update S and event_times
        event_times.append(t)
        S.append(p)

    return event_times, S
```

```
print("TEST CASES:\n")
print("Test Case 1\n")
print("Test DataFrame:")
sample_df = pd.DataFrame(columns=["Time", "Event"])
sample_df.Time = [5, 10, 15]
sample_df.Event = [0, 1, 0]
print(sample_df.head())
print("\nOutput:")
x, y = HomemadeKM(sample_df)
print("Event times: {}, Survival Probabilities: {}".format(x, y))

print("\nTest Case 2\n")
print("Test DataFrame:")
sample_df = pd.DataFrame(columns=["Time", "Event"])
sample_df.loc[:, "Time"] = [2, 15, 12, 10, 20]
sample_df.loc[:, "Event"] = [0, 0, 1, 1, 1]
print(sample_df.head())
print("\nOutput:")
x, y = HomemadeKM(sample_df)
print("Event times: {}, Survival Probabilities: {}".format(x, y))
```

```
TEST CASES:

Test Case 1

Test DataFrame:
   Time  Event
0     5      0
1    10      1
2    15      0

Output:
Event times: [0, 5, 10, 15], Survival Probabilities: [1.0, 1.0, 0.5, 0.5]

Test Case 2

Test DataFrame:
   Time  Event
0     2      0
1    15      0
2    12      1
3    10      1
4    20      1

Output:
Event times: [0, 2, 10, 12, 15, 20], Survival Probabilities: [1.0, 1.0, 0.75, 0.5, 0.5, 0.0]
```

Now let’s plot the two against each other on the data to see the difference.

```
max_time = data.Time.max()
x = range(0, max_time + 1)
y = np.zeros(len(x))
for i, t in enumerate(x):
    y[i] = naive_estimator(t, data)
plt.plot(x, y, label="Naive")

x, y = HomemadeKM(data)
plt.step(x, y, label="Kaplan-Meier")
plt.xlabel("Time")
plt.ylabel("Survival probability estimate")
plt.legend()
plt.show()
```

We see that along with `Time` and `Event`, we have a column called `Stage_group`.

- A value of 1 in this column denotes a patient with stage III cancer
- A value of 2 denotes stage IV.

We want to compare the survival functions of these two groups.

This time we'll use the `KaplanMeierFitter` class from `lifelines`. Run the next cell to fit and plot the Kaplan-Meier curves for each group.

```
S1 = data[data.Stage_group == 1]
km1 = KM()
km1.fit(S1.loc[:, 'Time'], event_observed = S1.loc[:, 'Event'], label = 'Stage III')
S2 = data[data.Stage_group == 2]
km2 = KM()
km2.fit(S2.loc[:, "Time"], event_observed = S2.loc[:, 'Event'], label = 'Stage IV')
ax = km1.plot(ci_show=False)
km2.plot(ax = ax, ci_show=False)
plt.xlabel('time')
plt.ylabel('Survival probability estimate')
plt.savefig('two_km_curves', dpi=300)
```

Let’s compare the survival functions at 90, 180, 270, and 360 days.

```
survivals = pd.DataFrame([90, 180, 270, 360], columns = ['time'])
survivals.loc[:, 'Group 1'] = km1.survival_function_at_times(survivals['time']).values
survivals.loc[:, 'Group 2'] = km2.survival_function_at_times(survivals['time']).values
```

This makes clear the difference in survival between the Stage III and IV cancer groups in the dataset.

To say whether there is a statistical difference between the survival curves we can run the log-rank test. This test tells us the probability that we could observe this data if the two curves were the same. The derivation of the log-rank test is somewhat complicated, but luckily `lifelines` has a simple function to compute it.

```
def logrank_p_value(group_1_data, group_2_data):
    result = logrank_test(group_1_data.Time, group_2_data.Time,
                          group_1_data.Event, group_2_data.Event)
    return result.p_value

logrank_p_value(S1, S2)
```

```
0.009588929834755544
```

We get a p-value of less than `0.05`, which indicates that the difference in the curves is indeed statistically significant.

Credits: Coursera AI in Medicine course

A nicely formatted version of this blog is at https://kirankamath.netlify.app/blog/survival-estimates-lymphoma-patients/

*Originally published at https://kirankamath.netlify.app.*

We are going to use the Oxford 102 Flower Dataset by Nilsback, M-E. and Zisserman, A., 2008: a dataset consisting of 102 flower categories commonly occurring in the United Kingdom. Each class consists of between 40 and 258 images. The images have large scale, pose, and light variations.

Credits: fastai (for inspiring me to do this project, and of course I'm using the fastai library)

To understand the details of all the basic code, see my 1st project.

```
!curl -s https://course.fast.ai/setup/colab | bash
```

```
Updating fastai...
Done.
```

```
%reload_ext autoreload
%autoreload 2
%matplotlib inline
```

```
from fastai import *
from fastai.basics import *
from fastai.vision import *
from fastai.metrics import *
```

```
from fastai.callbacks.hooks import *
from fastai.utils.mem import *
```

```
path=untar_data(URLs.FLOWERS)
```

```
Downloading https://s3.amazonaws.com/fast-ai-imageclas/oxford-102-flowers
```

```
path.ls()
```

```
[PosixPath('/root/.fastai/data/oxford-102-flowers/test.txt'),
PosixPath('/root/.fastai/data/oxford-102-flowers/valid.txt'),
PosixPath('/root/.fastai/data/oxford-102-flowers/train.txt'),
PosixPath('/root/.fastai/data/oxford-102-flowers/jpg')]
```

```
path_img=path/'jpg'
```

This dataset has train, test, and validation splits in text files and the images in a jpg folder, so using a pandas dataframe looks like a good fit here.

```
trn=pd.read_csv(path/'train.txt',sep=" ",header=None)
val=pd.read_csv(path/'valid.txt',sep=" ",header=None)
tst=pd.read_csv(path/'test.txt',sep=" ",header=None)
df = trn.append(val,ignore_index=True).append(tst,ignore_index=True)
df.columns=['Img','Class']
df.index=df.Img
df.head()
```

```
Img Class
Img
jpg/image_03860.jpg jpg/image_03860.jpg 16
jpg/image_06092.jpg jpg/image_06092.jpg 13
jpg/image_02400.jpg jpg/image_02400.jpg 42
jpg/image_02852.jpg jpg/image_02852.jpg 55
jpg/image_07710.jpg jpg/image_07710.jpg 96
```

```
len(trn), len(val), len(tst)
```

```
(1020, 1020, 6149)
```

We have images in the format jpg/{some name}.jpg along with a class label, but we don't know the name of each flower. To get the flower names, we have 2 options:

- create a dictionary manually by consulting the dataset website, or use the available content.json which maps labels to flower names
- use web scraping to extract the flower names.

The 2nd option looks good, since it is an opportunity to learn web scraping. I am using BeautifulSoup.

```
from bs4 import BeautifulSoup
import requests
import re
```

```
url = 'http://www.robots.ox.ac.uk/~vgg/data/flowers/102/categories.html'
r= requests.get(url)
soup=BeautifulSoup(r.content, "lxml")
ims=soup.findAll('img')
print(ims)
```

```
[<img alt="alpine sea holly" border="0" height="75" src="thumbs/thumbim_06974.jpg" width="78"/>, <img alt="buttercup" border="0" height="76" src="thumbs/thumbim_04657.jpg" width="75"/>, <img alt="fire lily" border="0" height="75" src="thumbs/thumbim_06779.jpg" width="75"/>, <img alt="anthurium" border="0" height="75" src="thumbs/thumbim_02011.jpg" width="76"/>, <img alt="californian poppy" border="0" height="75" src="thumbs/thumbim_03206.jpg" width="76"/>, <img alt="foxglove" border="0" height="75" src="thumbs/thumbim_07419.jpg" width="80"/>, <img alt="artichoke" border="0" height="75" src="thumbs/thumbim_04093.jpg" width="80"/>, <img alt="camellia" border="0" height="75" src="thumbs/thumbim_07652.jpg" width="75"/>, <img alt="frangipani" border="0" height="76" src="thumbs/thumbim_00784.jpg" width="75"/>, <img alt="azalea" border="0" height="75" src="thumbs/thumbim_03581.jpg" width="75"/>, <img alt="canna lily" border="0" height="75" src="thumbs/thumbim_04479.jpg" width="75"/>, <img alt="fritillary" border="0" height="75" src="thumbs/thumbim_03379.jpg" width="75"/>, <img alt="ball moss" border="0" height="75" src="thumbs/thumbim_06024.jpg" width="75"/>, <img alt="canterbury bells" border="0" height="77" src="thumbs/thumbim_06618.jpg" width="75"/>, <img alt="garden phlox" border="0" height="75" src="thumbs/thumbim_05594.jpg" width="75"/>, <img alt="balloon flower" border="0" height="76" src="thumbs/thumbim_06156.jpg" width="75"/>, <img alt="cape flower" border="0" height="76" src="thumbs/thumbim_03806.jpg" width="75"/>, <img alt="gaura" border="0" height="75" src="thumbs/thumbim_08149.jpg" width="78"/>, <img alt="barbeton daisy" border="0" height="75" src="thumbs/thumbim_02211.jpg" width="75"/>, <img alt="carnation" border="0" height="75" src="thumbs/thumbim_06923.jpg" width="75"/>, <img alt="gazania" border="0" height="75" src="thumbs/thumbim_04481.jpg" width="75"/>, <img alt="bearded iris" border="0" height="75" src="thumbs/thumbim_05933.jpg" width="75"/>, <img 
alt="cautleya spicata" border="0" height="75" src="thumbs/thumbim_06249.jpg" width="75"/>, <img alt="geranium" border="0" height="76" src="thumbs/thumbim_02747.jpg" width="75"/>, <img alt="bee balm" border="0" height="76" src="thumbs/thumbim_03070.jpg" width="75"/>, <img alt="clematis" border="0" height="75" src="thumbs/thumbim_01668.jpg" width="75"/>, <img alt="giant white arum lily" border="0" height="76" src="thumbs/thumbim_04902.jpg" width="76"/>, <img alt="bird of paradise" border="0" height="75" src="thumbs/thumbim_03291.jpg" width="75"/>, <img alt="colt's foot" border="0" height="75" src="thumbs/thumbim_04019.jpg" width="81"/>, <img alt="globe thistle" border="0" height="78" src="thumbs/thumbim_07115.jpg" width="75"/>, <img alt="bishop of llandaff" border="0" height="75" src="thumbs/thumbim_02779.jpg" width="76"/>, <img alt="columbine" border="0" height="75" src="thumbs/thumbim_02577.jpg" width="75"/>, <img alt="globe-flower" border="0" height="75" src="thumbs/thumbim_06677.jpg" width="75"/>, <img alt="black-eyed susan" border="0" height="75" src="thumbs/thumbim_05890.jpg" width="75"/>, <img alt="common dandelion" border="0" height="76" src="thumbs/thumbim_06317.jpg" width="75"/>, <img alt="grape hyacinth" border="0" height="75" src="thumbs/thumbim_06601.jpg" width="75"/>, <img alt="blackberry lily" border="0" height="75" src="thumbs/thumbim_08035.jpg" width="75"/>, <img alt="corn poppy" border="0" height="77" src="thumbs/thumbim_06492.jpg" width="75"/>, <img alt="great masterwort" border="0" height="77" src="thumbs/thumbim_05819.jpg" width="75"/>, <img alt="blanket flower" border="0" height="75" src="thumbs/thumbim_07899.jpg" width="75"/>, <img alt="cyclamen " border="0" height="75" src="thumbs/thumbim_00590.jpg" width="77"/>, <img alt="hard-leaved pocket orchid" border="0" height="75" src="thumbs/thumbim_05101.jpg" width="75"/>, <img alt="bolero deep blue" border="0" height="75" src="thumbs/thumbim_07131.jpg" width="75"/>, <img alt="daffodil" border="0" 
height="75" src="thumbs/thumbim_05707.jpg" width="75"/>, <img alt="hibiscus" border="0" height="76" src="thumbs/thumbim_01761.jpg" width="75"/>, <img alt="bougainvillea" border="0" height="75" src="thumbs/thumbim_07483.jpg" width="75"/>, <img alt="desert-rose" border="0" height="75" src="thumbs/thumbim_04815.jpg" width="75"/>, <img alt="hippeastrum " border="0" height="75" src="thumbs/thumbim_04868.jpg" width="75"/>, <img alt="bromelia" border="0" height="75" src="thumbs/thumbim_07858.jpg" width="75"/>, <img alt="english marigold" border="0" height="75" src="thumbs/thumbim_05147.jpg" width="75"/>, <img alt="japanese anemone" border="0" height="75" src="thumbs/thumbim_08157.jpg" width="75"/>, <img alt="king protea" border="0" height="75" src="thumbs/thumbim_05744.jpg" width="75"/>, <img alt="peruvian lily" border="0" height="75" src="thumbs/thumbim_04270.jpg" width="75"/>, <img alt="stemless gentian" border="0" height="75" src="thumbs/thumbim_05236.jpg" width="79"/>, <img alt="lenten rose" border="0" height="75" src="thumbs/thumbim_04564.jpg" width="77"/>, <img alt="petunia" border="0" height="75" src="thumbs/thumbim_01456.jpg" width="75"/>, <img alt="sunflower" border="0" height="75" src="thumbs/thumbim_05414.jpg" width="81"/>, <img alt="lotus" border="0" height="75" src="thumbs/thumbim_01863.jpg" width="75"/>, <img alt="pincushion flower" border="0" height="75" src="thumbs/thumbim_05360.jpg" width="75"/>, <img alt="sweet pea" border="0" height="75" src="thumbs/thumbim_05684.jpg" width="76"/>, <img alt="love in the mist" border="0" height="75" src="thumbs/thumbim_06444.jpg" width="82"/>, <img alt="pink primrose" border="0" height="75" src="thumbs/thumbim_06757.jpg" width="76"/>, <img alt="sweet william" border="0" height="77" src="thumbs/thumbim_03495.jpg" width="75"/>, <img alt="magnolia" border="0" height="75" src="thumbs/thumbim_05520.jpg" width="82"/>, <img alt="pink-yellow dahlia?" 
border="0" height="75" src="thumbs/thumbim_02947.jpg" width="76"/>, <img alt="sword lily" border="0" height="76" src="thumbs/thumbim_02392.jpg" width="75"/>, <img alt="mallow" border="0" height="75" src="thumbs/thumbim_07737.jpg" width="76"/>, <img alt="poinsettia" border="0" height="76" src="thumbs/thumbim_01511.jpg" width="76"/>, <img alt="thorn apple" border="0" height="76" src="thumbs/thumbim_02144.jpg" width="75"/>, <img alt="marigold" border="0" height="78" src="thumbs/thumbim_05010.jpg" width="75"/>, <img alt="primula" border="0" height="76" src="thumbs/thumbim_03679.jpg" width="75"/>, <img alt="tiger lily" border="0" height="77" src="thumbs/thumbim_07167.jpg" width="75"/>, <img alt="mexican aster" border="0" height="78" src="thumbs/thumbim_06945.jpg" width="75"/>, <img alt="prince of wales feathers" border="0" height="75" src="thumbs/thumbim_06884.jpg" width="78"/>, <img alt="toad lily" border="0" height="75" src="thumbs/thumbim_06694.jpg" width="77"/>, <img alt="mexican petunia" border="0" height="79" src="thumbs/thumbim_07779.jpg" width="75"/>, <img alt="purple coneflower" border="0" height="75" src="thumbs/thumbim_03886.jpg" width="77"/>, <img alt="tree mallow" border="0" height="75" src="thumbs/thumbim_02910.jpg" width="76"/>, <img alt="monkshood" border="0" height="94" src="thumbs/thumbim_06415.jpg" width="75"/>, <img alt="red ginger" border="0" height="76" src="thumbs/thumbim_06830.jpg" width="75"/>, <img alt="tree poppy" border="0" height="76" src="thumbs/thumbim_05323.jpg" width="75"/>, <img alt="moon orchid" border="0" height="75" src="thumbs/thumbim_07215.jpg" width="76"/>, <img alt="rose" border="0" height="77" src="thumbs/thumbim_01270.jpg" width="75"/>, <img alt="trumpet creeper" border="0" height="75" src="thumbs/thumbim_07949.jpg" width="76"/>, <img alt="morning glory" border="0" height="75" src="thumbs/thumbim_02451.jpg" width="78"/>, <img alt="ruby-lipped cattleya" border="0" height="75" src="thumbs/thumbim_04333.jpg" width="75"/>, <img 
alt="wallflower" border="0" height="75" src="thumbs/thumbim_00954.jpg" width="76"/>, <img alt="orange dahlia" border="0" height="94" src="thumbs/thumbim_05061.jpg" width="75"/>, <img alt="siam tulip" border="0" height="75" src="thumbs/thumbim_07011.jpg" width="77"/>, <img alt="water lily" border="0" height="75" src="thumbs/thumbim_00323.jpg" width="76"/>, <img alt="osteospermum" border="0" height="77" src="thumbs/thumbim_05571.jpg" width="75"/>, <img alt="silverbush" border="0" height="76" src="thumbs/thumbim_06126.jpg" width="75"/>, <img alt="watercress" border="0" height="76" src="thumbs/thumbim_00634.jpg" width="75"/>, <img alt="oxeye daisy" border="0" height="76" src="thumbs/thumbim_06224.jpg" width="75"/>, <img alt="snapdragon" border="0" height="75" src="thumbs/thumbim_03117.jpg" width="75"/>, <img alt="wild pansy" border="0" height="75" src="thumbs/thumbim_04194.jpg" width="76"/>, <img alt="passion flower" border="0" height="75" src="thumbs/thumbim_00005.jpg" width="75"/>, <img alt="spear thistle" border="0" height="75" src="thumbs/thumbim_06055.jpg" width="75"/>, <img alt="windflower" border="0" height="75" src="thumbs/thumbim_05982.jpg" width="76"/>, <img alt="pelargonium" border="0" height="75" src="thumbs/thumbim_04696.jpg" width="76"/>, <img alt="spring crocus" border="0" height="76" src="thumbs/thumbim_07052.jpg" width="75"/>, <img alt="yellow iris" border="0" height="75" src="thumbs/thumbim_06353.jpg" width="77"/>]
```

Observe the content of ims carefully. It is a list of img tags, in which we are interested in the src and alt attributes: alt contains the label names, and src contains details matching the Img column of our dataframe. Using this, we can map each jpg/{something}.jpg to a label name.

```
sample = {}
for im in ims:
    sample[f"jpg/image_{im['src'].split('_')[-1]}"] = im['alt']

len(sample.keys()), {k: sample[k] for k in list(sample)[:5]}
```

```
(102,
{'jpg/image_02011.jpg': 'anthurium',
'jpg/image_03206.jpg': 'californian poppy',
'jpg/image_04657.jpg': 'buttercup',
'jpg/image_06779.jpg': 'fire lily',
'jpg/image_06974.jpg': 'alpine sea holly'})
```

We can see that it shows 102 keys, which means we have scraped all 102 image labels.

But we actually want a dictionary that maps class label numbers to names, so let's build that.

```
names = {}
for im in sample.keys():
    names[df.loc[im]['Class']] = sample[im]

{k: names[k] for k in list(names)[:5]}
```

```
{20: 'fire lily',
34: 'alpine sea holly',
47: 'buttercup',
64: 'californian poppy',
79: 'anthurium'}
```

Go through the labels once

```
{k: names[k] for k in list(names)}
```

```
{0: 'pink primrose',
1: 'hard-leaved pocket orchid',
2: 'canterbury bells',
3: 'sweet pea',
4: 'english marigold',
5: 'tiger lily',
6: 'moon orchid',
7: 'bird of paradise',
8: 'monkshood',
9: 'globe thistle',
10: 'snapdragon',
11: "colt's foot",
12: 'king protea',
13: 'spear thistle',
14: 'yellow iris',
15: 'globe-flower',
16: 'purple coneflower',
17: 'peruvian lily',
18: 'balloon flower',
19: 'giant white arum lily',
20: 'fire lily',
21: 'pincushion flower',
22: 'fritillary',
23: 'red ginger',
24: 'grape hyacinth',
25: 'corn poppy',
26: 'prince of wales feathers',
27: 'stemless gentian',
28: 'artichoke',
29: 'sweet william',
30: 'carnation',
31: 'garden phlox',
32: 'love in the mist',
33: 'mexican aster',
34: 'alpine sea holly',
35: 'ruby-lipped cattleya',
36: 'cape flower',
37: 'great masterwort',
38: 'siam tulip',
39: 'lenten rose',
40: 'barbeton daisy',
41: 'daffodil',
42: 'sword lily',
43: 'poinsettia',
44: 'bolero deep blue',
45: 'wallflower',
46: 'marigold',
47: 'buttercup',
48: 'oxeye daisy',
49: 'common dandelion',
50: 'petunia',
51: 'wild pansy',
52: 'primula',
53: 'sunflower',
54: 'pelargonium',
55: 'bishop of llandaff',
56: 'gaura',
57: 'geranium',
58: 'orange dahlia',
59: 'pink-yellow dahlia?',
60: 'cautleya spicata',
61: 'japanese anemone',
62: 'black-eyed susan',
63: 'silverbush',
64: 'californian poppy',
65: 'osteospermum',
66: 'spring crocus',
67: 'bearded iris',
68: 'windflower',
69: 'tree poppy',
70: 'gazania',
71: 'azalea',
72: 'water lily',
73: 'rose',
74: 'thorn apple',
75: 'morning glory',
76: 'passion flower',
77: 'lotus',
78: 'toad lily',
79: 'anthurium',
80: 'frangipani',
81: 'clematis',
82: 'hibiscus',
83: 'columbine',
84: 'desert-rose',
85: 'tree mallow',
86: 'magnolia',
87: 'cyclamen ',
88: 'watercress',
89: 'canna lily',
90: 'hippeastrum ',
91: 'bee balm',
92: 'ball moss',
93: 'foxglove',
94: 'bougainvillea',
95: 'camellia',
96: 'mallow',
97: 'mexican petunia',
98: 'bromelia',
99: 'blanket flower',
100: 'trumpet creeper',
101: 'blackberry lily'}
```

```
codes = np.array([names[i] for i in range(len(names))]); codes
```

```
array(['pink primrose', 'hard-leaved pocket orchid', 'canterbury bells', 'sweet pea', ..., 'bromelia',
'blanket flower', 'trumpet creeper', 'blackberry lily'], dtype='<U25')
```

Many labels contain spaces, so let's replace them with underscores:

```
trn[trn.columns[-1]] = trn[trn.columns[-1]].apply(lambda x: codes[x].replace(' ','_'))
val[val.columns[-1]] = val[val.columns[-1]].apply(lambda x: codes[x].replace(' ','_'))
tst[tst.columns[-1]] = tst[tst.columns[-1]].apply(lambda x: codes[x].replace(' ','_'))
```

```
trn[:5]
```

Once you've confirmed that no label contains a space, move on.
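The space-to-underscore cleanup itself is plain Python string replacement; a minimal self-contained sketch of what the apply calls above do to each label:

```python
# Replace spaces in label names with underscores, as done on the
# dataframe columns above (plain-Python sketch, no pandas needed)
labels = ['fire lily', 'alpine sea holly', 'buttercup']
cleaned = [label.replace(' ', '_') for label in labels]
assert all(' ' not in label for label in cleaned)
print(cleaned)  # ['fire_lily', 'alpine_sea_holly', 'buttercup']
```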

```
fnames= get_image_files(path_img)
fnames[:3]
```

```
[PosixPath('/root/.fastai/data/oxford-102-flowers/jpg/image_00529.jpg'),
PosixPath('/root/.fastai/data/oxford-102-flowers/jpg/image_00328.jpg'),
PosixPath('/root/.fastai/data/oxford-102-flowers/jpg/image_07095.jpg')]
```

```
img_f = fnames[0]
img=open_image(img_f)
img.show(figsize=(5,5))
```

```
src_size = min(img.size)
```

```
size=src_size//2
bs=4
```

```
trnList = ImageList.from_df(df=trn, path=path)
valList = ImageList.from_df(df=val, path=path)
tstList = ImageList.from_df(df=tst, path=path)
```

```
src = (ImageList.from_folder(path).split_by_list(trnList,valList).label_from_df())
```

Each of the steps above is part of the fastai data block API; explaining every call here would take a lot of space, so see the fastai docs for details.

```
data = (src.transform(get_transforms(),size=size).databunch(bs=bs).normalize(imagenet_stats))
```

```
data.show_batch(4, figsize=(10,7))
```

```
data.show_batch(4,figsize=(10,7),ds_type=DatasetType.Valid)
```

```
metrics=accuracy
```

```
learn=cnn_learner(data,models.resnet34,metrics=metrics)
```

```
Downloading: "https://download.pytorch.org/models/resnet34-333f7ec4.pth" to /root/.cache/torch/checkpoints/resnet34-333f7ec4.pth
```

This downloads pretrained ResNet-34 weights trained on ImageNet. You can also try VGG or ResNet-50; ResNet-50 generally gives better accuracy, but I'm not using it here because of space and time limitations.

```
lr_find(learn)
learn.recorder.plot()
```

```
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
```

```
lr=3e-3
```

```
learn.fit_one_cycle(10, slice(lr), pct_start=0.9)
```
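pct_start=0.9 means the learning rate rises for the first 90% of iterations and anneals only over the final 10%. A rough, hypothetical sketch of such a schedule (fastai v1's actual implementation differs in details):

```python
import math

def one_cycle_lr(pct, lr_max=3e-3, pct_start=0.9, div=25.0):
    """Rough sketch of a one-cycle LR schedule (hypothetical helper).

    LR climbs from lr_max/div to lr_max over the first pct_start of
    training, then cosine-anneals back towards zero.
    """
    if pct < pct_start:
        p = pct / pct_start                      # position in the warm-up phase
        return lr_max / div + (lr_max - lr_max / div) * (1 - math.cos(math.pi * p)) / 2
    p = (pct - pct_start) / (1 - pct_start)      # position in the anneal phase
    return lr_max * (1 + math.cos(math.pi * p)) / 2

print(one_cycle_lr(0.0))   # starts at lr_max / div
print(one_cycle_lr(0.9))   # peaks at lr_max
print(one_cycle_lr(1.0))   # ends near zero
```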

```
epoch train_loss valid_loss accuracy time
0 6.156984 4.590524 0.046078 00:30
1 5.120844 3.424721 0.224510 00:30
2 3.638896 2.221143 0.484314 00:30
3 2.891678 1.523283 0.636275 00:30
4 2.325167 1.261948 0.686275 00:30
5 2.097249 1.228347 0.677451 00:30
6 1.986707 1.102566 0.711765 00:30
7 2.222347 1.115762 0.701961 00:30
8 1.909852 0.974875 0.743137 00:30
9 1.473234 0.843119 0.775490 00:31
```

```
learn.save('stage-1')
```

This is an important step: if you want to experiment with different learning rates and other hyperparameters, saving here lets you reload this checkpoint whenever needed.

```
learn.load('stage-1')
```

```
Learner(data=ImageDataBunch;
Train: LabelList (1020 items)
x: ImageList
Image (3, 250, 250),Image (3, 250, 250),Image (3, 250, 250),Image (3, 250, 250),Image (3, 250, 250)
y: CategoryList
purple_coneflower,spear_thistle,sword_lily,bishop_of_llandaff,mallow
Path: /root/.fastai/data/oxford-102-flowers;
Valid: LabelList (1020 items)
x: ImageList
Image (3, 250, 250),Image (3, 250, 250),Image (3, 250, 250),Image (3, 250, 250),Image (3, 250, 250)
y: CategoryList
canna_lily,bolero_deep_blue,english_marigold,alpine_sea_holly,anthurium
Path: /root/.fastai/data/oxford-102-flowers;
Test: None, model=Sequential(
(0): Sequential(
(0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(4): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(5): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(6): Sequential(
(0): BasicBlock(
(conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(4): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(5): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(7): Sequential(
(0): BasicBlock(
(conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(1): Sequential(
(0): AdaptiveConcatPool2d(
(ap): AdaptiveAvgPool2d(output_size=1)
(mp): AdaptiveMaxPool2d(output_size=1)
)
(1): Flatten()
(2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Dropout(p=0.25, inplace=False)
(4): Linear(in_features=1024, out_features=512, bias=True)
(5): ReLU(inplace=True)
(6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Dropout(p=0.5, inplace=False)
(8): Linear(in_features=512, out_features=102, bias=True)
)
), opt_func=functools.partial(<class 'torch.optim.adam.Adam'>, betas=(0.9, 0.99)), loss_func=FlattenedLoss of CrossEntropyLoss(), metrics=[<function accuracy at 0x7fcf35abed90>], true_wd=True, bn_wd=True, wd=0.01, train_bn=True, path=PosixPath('/root/.fastai/data/oxford-102-flowers'), model_dir='models', callback_fns=[functools.partial(<class 'fastai.basic_train.Recorder'>, add_time=True, silent=False)], callbacks=[], layer_groups=[Sequential(
(0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(5): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): ReLU(inplace=True)
(7): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(8): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(9): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(10): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(11): ReLU(inplace=True)
(12): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(13): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(14): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(15): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(16): ReLU(inplace=True)
(17): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(18): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(19): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(20): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(21): ReLU(inplace=True)
(22): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(23): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(24): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(25): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(26): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(27): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(28): ReLU(inplace=True)
(29): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(30): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(31): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(32): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(33): ReLU(inplace=True)
(34): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(35): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(36): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(37): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(38): ReLU(inplace=True)
(39): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(40): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
), Sequential(
(0): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
(6): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(8): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(11): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(13): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(14): ReLU(inplace=True)
(15): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(16): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(17): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(18): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(19): ReLU(inplace=True)
(20): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(21): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(22): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(23): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(24): ReLU(inplace=True)
(25): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(26): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(27): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(28): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(29): ReLU(inplace=True)
(30): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(31): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(32): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(33): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(34): ReLU(inplace=True)
(35): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(36): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(37): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(38): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(39): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(40): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(41): ReLU(inplace=True)
(42): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(43): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(44): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(45): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(46): ReLU(inplace=True)
(47): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(48): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
), Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): AdaptiveMaxPool2d(output_size=1)
(2): Flatten()
(3): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(4): Dropout(p=0.25, inplace=False)
(5): Linear(in_features=1024, out_features=512, bias=True)
(6): ReLU(inplace=True)
(7): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(8): Dropout(p=0.5, inplace=False)
(9): Linear(in_features=512, out_features=102, bias=True)
)], add_time=True, silent=False)
```

```
learn.show_results(rows=4,figsize=(8,9))
```

This step is useful since it shows which images were classified correctly and which were not.

```
learn.unfreeze()
```

Unfreezes entire model, sets every layer group to train.

```
lrs = slice(lr/400, lr/4)
```
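slice(lr/400, lr/4) gives each layer group its own learning rate, spread geometrically between the two endpoints: early layers get small updates, the head gets large ones. Assuming three layer groups (as fastai uses for a cnn_learner), the spread works out like this:

```python
# Geometric spread of per-group learning rates between the endpoints of
# slice(lo, hi) (sketch; assumes 3 layer groups)
lr = 3e-3
lo, hi = lr / 400, lr / 4
n_groups = 3
group_lrs = [lo * (hi / lo) ** (i / (n_groups - 1)) for i in range(n_groups)]
print(group_lrs)  # smallest LR for the earliest layers, largest for the head
```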

```
learn.fit_one_cycle(12,lrs,pct_start=0.8)
```

```
learn.save('stage-2')
```

Accuracy has improved compared to stage 1, so save this checkpoint.

```
learn.show_results(ds_type=DatasetType.Valid, rows=4,figsize=(8,9))
```

```
learn.export()
```

```
src2 = (ImageList.from_folder(path)
.split_by_list(trnList, tstList)
.label_from_df()
)
data2 = (src2.transform(get_transforms(), size=size)
.databunch(bs=bs)
.normalize(imagenet_stats))
learn2 = cnn_learner(data2, models.resnet34, metrics=metrics)
learn2.load('stage-2');
```

```
preds,y,losses = learn2.get_preds(with_loss=True)
accuracy(preds,y)
```

```
tensor(0.8452)
```
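The accuracy metric here simply compares the argmax of each prediction row with its target and averages; a dependency-free sketch of the computation:

```python
# What accuracy(preds, y) computes: fraction of rows whose argmax
# matches the target (plain-Python sketch of the metric)
preds = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]   # toy class probabilities
y = [1, 0, 0]                                   # toy targets
correct = sum(max(range(len(row)), key=row.__getitem__) == t
              for row, t in zip(preds, y))
print(correct / len(y))  # 2 of 3 correct -> 0.6666666666666666
```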

```
len(trn), len(val),len(tst)
```

```
(1020, 1020, 6149)
```

```
bs = 8
size = src_size
metrics=accuracy
```

```
src3 = (ImageList.from_df(df=trn.append(tst, ignore_index=True), path=path)
.split_by_rand_pct(valid_pct=0.2, seed=42)
.label_from_df()
)
data3 = (src3.transform(get_transforms(), size=size)
.databunch(bs=bs)
.normalize(imagenet_stats))
learn = cnn_learner(data3, models.resnet34, metrics=metrics)
```
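Pooling the 1020 training images with the 6149 test images and holding out a random 20% for validation gives the split sizes reported in the Learner summary; the arithmetic is:

```python
# The 80/20 re-split arithmetic: 1020 train + 6149 test images pooled,
# 20% held out for validation (sketch of what
# split_by_rand_pct(valid_pct=0.2) works out to)
n_items = 1020 + 6149
n_valid = int(n_items * 0.2)
n_train = n_items - n_valid
print(n_train, n_valid)  # 5736 1433
```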

```
learn.load('stage-2')
```

```
Learner(data=ImageDataBunch;
Train: LabelList (5736 items)
x: ImageList
Image (3, 500, 500),Image (3, 500, 500),Image (3, 500, 500),Image (3, 500, 500),Image (3, 500, 500)
y: CategoryList
purple_coneflower,spear_thistle,sword_lily,bishop_of_llandaff,mallow
Path: /root/.fastai/data/oxford-102-flowers;
Valid: LabelList (1433 items)
x: ImageList
Image (3, 500, 500),Image (3, 500, 500),Image (3, 500, 500),Image (3, 500, 500),Image (3, 500, 500)
y: CategoryList
pink-yellow_dahlia?,daffodil,purple_coneflower,watercress,bolero_deep_blue
Path: /root/.fastai/data/oxford-102-flowers;
Test: None, model=Sequential(
(0): Sequential(
(0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(4): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(5): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(6): Sequential(
(0): BasicBlock(
(conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(4): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(5): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(7): Sequential(
(0): BasicBlock(
(conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(1): Sequential(
(0): AdaptiveConcatPool2d(
(ap): AdaptiveAvgPool2d(output_size=1)
(mp): AdaptiveMaxPool2d(output_size=1)
)
(1): Flatten()
(2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Dropout(p=0.25, inplace=False)
(4): Linear(in_features=1024, out_features=512, bias=True)
(5): ReLU(inplace=True)
(6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Dropout(p=0.5, inplace=False)
(8): Linear(in_features=512, out_features=102, bias=True)
)
), opt_func=functools.partial(<class 'torch.optim.adam.Adam'>, betas=(0.9, 0.99)), loss_func=FlattenedLoss of CrossEntropyLoss(), metrics=[<function accuracy at 0x7fcf35abed90>], true_wd=True, bn_wd=True, wd=0.01, train_bn=True, path=PosixPath('/root/.fastai/data/oxford-102-flowers'), model_dir='models', callback_fns=[functools.partial(<class 'fastai.basic_train.Recorder'>, add_time=True, silent=False)], callbacks=[], layer_groups=[Sequential(
(0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(5): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(6): ReLU(inplace=True)
(7): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(8): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(9): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(10): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(11): ReLU(inplace=True)
(12): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(13): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(14): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(15): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(16): ReLU(inplace=True)
(17): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(18): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(19): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(20): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(21): ReLU(inplace=True)
(22): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(23): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(24): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(25): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(26): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(27): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(28): ReLU(inplace=True)
(29): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(30): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(31): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(32): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(33): ReLU(inplace=True)
(34): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(35): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(36): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(37): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(38): ReLU(inplace=True)
(39): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(40): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
), Sequential(
(0): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
(6): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(8): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(11): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(13): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(14): ReLU(inplace=True)
(15): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(16): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(17): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(18): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(19): ReLU(inplace=True)
(20): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(21): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(22): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(23): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(24): ReLU(inplace=True)
(25): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(26): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(27): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(28): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(29): ReLU(inplace=True)
(30): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(31): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(32): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(33): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(34): ReLU(inplace=True)
(35): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(36): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(37): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(38): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(39): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(40): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(41): ReLU(inplace=True)
(42): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(43): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(44): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(45): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(46): ReLU(inplace=True)
(47): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(48): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
), Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): AdaptiveMaxPool2d(output_size=1)
(2): Flatten()
(3): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(4): Dropout(p=0.25, inplace=False)
(5): Linear(in_features=1024, out_features=512, bias=True)
(6): ReLU(inplace=True)
(7): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(8): Dropout(p=0.5, inplace=False)
(9): Linear(in_features=512, out_features=102, bias=True)
)], add_time=True, silent=False)
```

```
learn.unfreeze()
```

```
lr_find(learn)
learn.recorder.plot()
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.
```

```
lr=5e-4
```

```
lrs=slice(1e-5,lr/10)
```

```
learn.fit_one_cycle(10,lrs)
```

```
epoch train_loss valid_loss accuracy time
0 1.384226 0.477209 0.879274 04:34
1 0.818306 0.230950 0.943475 04:35
2 0.518546 0.167983 0.956036 04:35
3 0.387305 0.140578 0.968597 04:36
4 0.300006 0.115997 0.972087 04:33
5 0.240830 0.093787 0.981158 04:35
6 0.187486 0.090038 0.977669 04:37
7 0.109384 0.084203 0.983252 04:40
8 0.143494 0.085302 0.981158 04:37
9 0.151969 0.083969 0.981158 04:33
```

```
learn.save('stage-3')
```

```
learn.export()
```

```
src2 = (ImageList.from_folder(path)
.split_by_list(trnList, valList)
.label_from_df()
)
data2 = (src2.transform(get_transforms(), size=size)
.databunch(bs=bs)
.normalize(imagenet_stats))
learn2 = cnn_learner(data2, models.resnet34, metrics=metrics)
learn2.load('stage-3');
```

```
preds,y,losses = learn2.get_preds(with_loss=True)
accuracy(preds,y)
```

```
tensor(0.9784)
```

I got an accuracy of 97.84%.

I was looking for a text editor that I could use in all my projects. I started with VS Code, Atom, and many more, and at first didn't even consider Vim. Of all the editors I tried, VS Code was my favorite. But then I was working on something that used a lot of my laptop's RAM, and since VS Code itself uses a lot of memory, I started looking for editors that are lightweight yet powerful. Several articles inspired me to try Vim. Now that I use Vim (VS Code is still on the laptop, but Vim gets most of the use), I consider it a powerful editor.

Let us see some basics of vim to get started and see how to be good with it.

Vim has a rich history: it originated from the Vi editor (1976) and is still being developed today. Vim has some really neat ideas behind it, and for this reason lots of tools support a Vim emulation mode. Vim is probably worth learning even if you eventually switch to some other text editor.

What I love about Vim is its modes; even though they feel annoying at the beginning, they become a helpful and efficient way of working later on. When programming, you spend most of your time reading and editing, not writing. For this reason, Vim is a modal editor: it has different modes for inserting text versus manipulating text.

**Normal**: for moving around a file and making edits, mainly for reading.

**Insert**: for inserting text

**Replace**: for replacing text

**Visual** (plain, line, or block) mode: for selecting blocks of text

**Command-line**: for running a command

You change modes by pressing Esc (the escape key) to switch from any mode back to normal mode. From normal mode, enter insert mode with i, replace mode with R, and visual mode with v.

Command-line mode can be entered by typing **:**

- **:q** quit (close window)
- **:qa** quit all open windows
- **:q!** quit without saving
- **:w** save ("write")
- **:wq** save and quit
- **:e** {name of file} open file for editing
- **:ls** show open buffers
- **:help** {topic} open help

You should spend most of your time in normal mode, using movement commands to navigate the buffer.

- Basic movement: hjkl (left, down, up, right)
- Words: w (next word), b (beginning of word), e (end of word)
- Scroll: Ctrl u (up), Ctrl d (down)
- Find: f{character} jumps to the next occurrence of {character} on the current line

- **i** enter insert mode; for manipulating/deleting text you'll want more than backspace
- **o** / **O** insert a line below / above
- **d** delete with a motion, e.g. dw deletes a word
- **x** delete character (equal to dl)
- **u** undo, Ctrl r redo
- visual mode plus manipulation: select text, then d to delete it or c to change it; copying a block also works
- **/** search, e.g. /find searches for the word 'find'
- **.** repeats the last editing command
- :sp and :vsp split the window on the same file
- Ctrl w moves between split windows

You can combine commands with a count, which performs the given action that many times: 4k moves up four lines (k four times). You can also use modifiers to change a command's meaning: ci[ changes the contents inside the current pair of square brackets.

With these basics you can get started with Vim, and after a few days of use you may also come to see it as a powerful editor.

You can also learn Vim by playing in a game-like environment.

Credits:

- The Missing Semester of Your CS Education, the wonderful website that is the main inspiration for this blog. Thank you Anish, Jose, and Jon for creating it.
- Thanks to all the other text editors that helped me make Vim my default editor. haha!!

```
!curl -s https://course.fast.ai/setup/colab | bash
```

```
Updating fastai...
Done.
```

If you run a script that creates or downloads files, those files will NOT persist after the allocated instance is shut down. To save files, you need to permit your Colaboratory instance to read and write files to your Google Drive. Add the following code snippet at the beginning of every notebook:

```
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"
base_dir = root_dir + 'fastai-v3/'
```

Lines in a Jupyter notebook that start with '%' are called line magics. They are not instructions for Python to execute, but for the Jupyter notebook itself.

```
%reload_ext autoreload
%autoreload 2
%matplotlib inline
```

**%reload_ext autoreload** loads the autoreload extension.

**%autoreload 2** makes it reload all modules automatically before executing code typed at the IPython prompt, so edits to imported code take effect without restarting the kernel.

**%matplotlib inline** makes matplotlib render plots inside the Jupyter notebook.

We import all the necessary packages. We are going to work with the fastai v1 library, which sits on top of PyTorch 1.0. The fastai library provides many useful functions that enable us to quickly and easily build neural networks and train our models.

```
from fastai import *
from fastai.vision import *
from fastai.metrics import *
```

We are going to use the Oxford-IIIT Pet Dataset by O. M. Parkhi et al., 2012, which features 12 cat breeds and 25 dog breeds. Our model will need to learn to differentiate between these 37 distinct categories. We use the untar_data function, passing it a URL as an argument; it downloads and extracts the data.

```
help(untar_data)
```

```
Help on function untar_data in module fastai.datasets:
untar_data(url:str, fname:Union[pathlib.Path, str]=None, dest:Union[pathlib.Path, str]=None, data=True, force_download=False) -> pathlib.Path
Download `url` to `fname` if `dest` doesn't exist, and un-tgz to folder `dest`.
```

```
path=untar_data(URLs.PETS)
path
```

```
PosixPath('/content/data/oxford-iiit-pet')
```

A nice feature of Jupyter notebooks is that the last line of a cell is printed automatically, so instead of print(path) we can simply write path.

```
path.ls()
```

```
[PosixPath('/content/data/oxford-iiit-pet/annotations'),
PosixPath('/content/data/oxford-iiit-pet/images')]
```

Python 3's pathlib overloads the **/** operator, which makes it convenient to navigate into a directory just as you would on the command line. We use it to create Path variables pointing at the new locations.

```
path_anno=path/'annotations'
path_img=path/'images'
```
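As a standalone illustration (plain pathlib, no fastai needed), the `/` operator joins path components:

```python
from pathlib import Path

# pathlib overloads '/', so joining paths reads like a shell path.
p = Path('/content/data/oxford-iiit-pet')
images = p / 'images'
print(images)  # /content/data/oxford-iiit-pet/images
```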

The first thing we do when we approach a problem is to take a look at the data. We always need to understand very well what the problem is and what the data looks like before we can figure out how to solve it.

```
fnames = get_image_files(path_img)
fnames[:5]
```

```
[PosixPath('/content/data/oxford-iiit-pet/images/chihuahua_66.jpg'),
PosixPath('/content/data/oxford-iiit-pet/images/scottish_terrier_129.jpg'),
PosixPath('/content/data/oxford-iiit-pet/images/British_Shorthair_59.jpg'),
PosixPath('/content/data/oxford-iiit-pet/images/Russian_Blue_172.jpg'),
PosixPath('/content/data/oxford-iiit-pet/images/pomeranian_1.jpg')]
```

Fortunately, the fastai library has a handy function made exactly for this: **ImageDataBunch.from_name_re** gets the labels from the filenames using a regular expression. A detailed explanation of regular expressions is given in a post I found; understanding them is very important here.

```
np.random.seed(2)
pat=r'/([^/]+)_\d+.jpg$'
```
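To see what the pattern actually captures, here is a plain-Python sketch (using `re` directly rather than fastai) applied to one of the filenames listed earlier:

```python
import re

# The pattern captures everything between the last '/' and the trailing
# '_<digits>.jpg' suffix, i.e. the breed name embedded in the filename.
pat = r'/([^/]+)_\d+.jpg$'
fname = '/content/data/oxford-iiit-pet/images/scottish_terrier_129.jpg'
label = re.search(pat, fname).group(1)
print(label)  # scottish_terrier
```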

**ImageDataBunch** is used for classification based on images. The method **from_name_re** indicates that the class label is to be extracted from the file name using a regular expression.
The size argument is the size to which each image is resized. This is usually a square image, and 224 is used most of the time.
Normalization is also applied: the pixel values are shifted and scaled using the ImageNet channel means and standard deviations, so that each channel has roughly zero mean and unit variance.
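As a rough sketch of what that normalization amounts to (illustrative NumPy, not fastai's actual implementation; the mean/std values below are the published ImageNet statistics that imagenet_stats supplies):

```python
import numpy as np

# Per-channel normalization with the ImageNet statistics.
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

img = np.random.rand(224, 224, 3)      # pixel values already scaled to [0, 1]
normalized = (img - mean) / std        # roughly zero mean, unit variance per channel
print(normalized.shape)                # (224, 224, 3)
```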

```
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224)
data.normalize(imagenet_stats)
```

```
ImageDataBunch;
Train: LabelList (5912 items)
x: ImageList
Image (3, 224, 224),Image (3, 224, 224),Image (3, 224, 224),Image (3, 224, 224),Image (3, 224, 224)
y: CategoryList
chihuahua,scottish_terrier,British_Shorthair,scottish_terrier,yorkshire_terrier
Path: /content/data/oxford-iiit-pet/images;
Valid: LabelList (1478 items)
x: ImageList
Image (3, 224, 224),Image (3, 224, 224),Image (3, 224, 224),Image (3, 224, 224),Image (3, 224, 224)
y: CategoryList
Bengal,newfoundland,wheaten_terrier,samoyed,scottish_terrier
Path: /content/data/oxford-iiit-pet/images;
Test: None
```

```
data.show_batch(rows=3, figsize=(7,6))
```

**data.classes** lists the distinct labels that were extracted by the regular expression.

**data.c** has a broader meaning in fastai, but in this context it gives the total number of classes found in the dataset.

```
print(data.classes)
len(data.classes),data.c
```

```
['Abyssinian', 'Bengal', 'Birman', 'Bombay', 'British_Shorthair', 'Egyptian_Mau', 'Maine_Coon', 'Persian', 'Ragdoll', 'Russian_Blue', 'Siamese', 'Sphynx', 'american_bulldog', 'american_pit_bull_terrier', 'basset_hound', 'beagle', 'boxer', 'chihuahua', 'english_cocker_spaniel', 'english_setter', 'german_shorthaired', 'great_pyrenees', 'havanese', 'japanese_chin', 'keeshond', 'leonberger', 'miniature_pinscher', 'newfoundland', 'pomeranian', 'pug', 'saint_bernard', 'samoyed', 'scottish_terrier', 'shiba_inu', 'staffordshire_bull_terrier', 'wheaten_terrier', 'yorkshire_terrier']
(37, 37)
```

This is where I got stuck while following the lesson, and it's when I understood what Jeremy Howard meant by "spend most time in notebooks" and use the forums. Many people have taken this course, and there is a good chance that someone has already hit the same error you are getting. So search for the error first, and if you don't find it, ask in the forum.

In the lesson, the code calls cnn_learner directly, but that requires a pretrained model, and downloading it was a problem. To solve this you can download the model manually from PyTorch to disk, upload it to Colab, and move it to the directory shown in the stack trace. Or better, download it straight to Colab using the link in the error trace, which the following command does.

```
!cd /root/.cache/torch/checkpoints && curl -O https://download.pytorch.org/models/resnet34-333f7ec4.pth
```

```
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 83.2M 100 83.2M 0 0 17.2M 0 0:00:04 0:00:04 --:--:-- 18.3M
```

```
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.model
```

```
Sequential(
(0): Sequential(
(0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(4): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(5): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): BasicBlock(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(6): Sequential(
(0): BasicBlock(
(conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(4): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(5): BasicBlock(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(7): Sequential(
(0): BasicBlock(
(conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): BasicBlock(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): BasicBlock(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
)
(1): Sequential(
(0): AdaptiveConcatPool2d(
(ap): AdaptiveAvgPool2d(output_size=1)
(mp): AdaptiveMaxPool2d(output_size=1)
)
(1): Flatten()
(2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Dropout(p=0.25, inplace=False)
(4): Linear(in_features=1024, out_features=512, bias=True)
(5): ReLU(inplace=True)
(6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Dropout(p=0.5, inplace=False)
(8): Linear(in_features=512, out_features=37, bias=True)
)
)
```

We will train for 4 epochs (4 cycles through all our data). We create a learner object that takes the data, the network, and the metrics. The metrics are only used to print how the training is performing; we choose to print the error_rate.

```
learn.fit_one_cycle(4)
```

epoch | train_loss | valid_loss | error_rate | time
---|---|---|---|---
0 | 1.427111 | 0.327753 | 0.100812 | 01:34
1 | 0.602179 | 0.248631 | 0.076455 | 01:33
2 | 0.383989 | 0.218113 | 0.066982 | 01:37
3 | 0.264342 | 0.202494 | 0.060217 | 01:39

```
learn.save('stage-1')
```

We got an accuracy of about 94% (just a 6% error rate) with only a few lines of code, compared to the state-of-the-art model in the 2012 paper, which had 56% accuracy.
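For clarity, that accuracy figure is just the complement of the final error_rate from the table:

```python
# accuracy is the complement of error_rate
valid_error = 0.060217        # error_rate of the final epoch above
accuracy = 1 - valid_error
print(f"{accuracy:.2%}")      # 93.98%
```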

An AI boom is happening all over the world, and we often use buzzwords like artificial intelligence, machine learning, and deep learning. But there are many misconceptions around these words, and it's easy to get lost and fail to see the difference between hype and reality. Do we actually know what each term means? The terms often seem interchangeable, so it's important to know the differences. Haven't we all stumbled when someone asked us to differentiate between them? So let's see what these concepts are and how exactly they differ.

Whenever a machine completes tasks based on a set of stipulated rules that solve problems (algorithms), such "intelligent" behavior is called artificial intelligence. AI systems typically demonstrate at least some of the behaviors associated with human intelligence: planning, learning, reasoning, problem solving, knowledge representation, perception, motion, and manipulation. For example, such machines can move and manipulate objects, recognize patterns and movement, and solve bigger problems. AI is a broad field; machine learning is a subset of AI, and deep learning is a subset of machine learning.

A simple definition says: "algorithms that parse data, learn from that data, and then apply what they've learned to make informed decisions". As the name suggests, machine learning means computer systems with the ability to "learn". The intention of ML is to enable machines to learn by themselves from the provided data and make accurate predictions. It is a process of training an algorithm with data: giving it lots of data and allowing it to learn from the processed information. Machine learning serves as a function (capturing the relationship between input and output) that takes data as input and predicts a value as output. When we say something is capable of "machine learning", we mean it performs a function on the data given to it and gets progressively better over time. For example, a cleaning bot that turned on whenever you said "room is dirty" could learn to recognize different phrases containing the word "dirty". The way machines can learn new tricks gets really interesting when we look at deep learning.
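To make "training an algorithm with data" concrete, here is a toy sketch (my own illustration, not from any source cited here): fitting a line to noisy data by gradient descent, where the fit gets progressively better as the loop processes the data.

```python
import numpy as np

# Toy "learning from data": fit y = w*x + b by gradient descent on MSE.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 3.0 * x + 1.0 + rng.normal(0, 0.05, 100)  # noisy samples of y = 3x + 1

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    pred = w * x + b
    w -= lr * 2 * np.mean((pred - y) * x)     # gradient of MSE w.r.t. w
    b -= lr * 2 * np.mean(pred - y)           # gradient of MSE w.r.t. b

print(round(w, 1), round(b, 1))               # close to 3.0 and 1.0
```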

DL algorithms are roughly inspired by the information-processing patterns found in the human brain. Whenever we receive new information, the brain tries to compare it to known items before making sense of it. While basic machine learning models do get progressively better at their function, they still need some guidance: if an algorithm returns an inaccurate prediction, a human has to make adjustments. With a deep learning model, the algorithm can determine on its own whether a prediction is accurate, through its own neural network. A deep learning model is designed to continually analyze data with a logic structure similar to how a human draws conclusions. Back to the room-cleaning bot: it could be programmed to turn on when it recognizes someone saying the word "dirty", and as it keeps learning it might eventually turn on for any phrase containing that word. If the bot had a deep learning model, it could figure out that it should also turn on for cues like "floor is muddy" or "spilled juice on floor". Deep learning automatically discovers the features that matter for classification, whereas in classic machine learning we had to supply the features manually.
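As a minimal sketch of what a neural network actually computes (illustrative NumPy with made-up layer sizes; real deep learning models stack many such layers and learn the weights):

```python
import numpy as np

# A single hidden layer, forward pass only: input -> ReLU -> softmax.
rng = np.random.default_rng(0)
x = rng.standard_normal(4)                     # 4 input features
W1 = rng.standard_normal((8, 4))               # hidden layer weights
W2 = rng.standard_normal((2, 8))               # output layer weights

hidden = np.maximum(0, W1 @ x)                 # ReLU keeps only positive activations
logits = W2 @ hidden
probs = np.exp(logits) / np.exp(logits).sum()  # softmax: a distribution over 2 classes
print(probs.sum())                             # sums to (approximately) 1
```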

Do you now understand the difference between AI vs ML vs DL?

- **W**hat are New Year's resolutions?
- **W**hy are New Year's resolutions required?
- Ho**w** to follow them efficiently without breaking any?

A New Year's resolution is a tradition in which a person resolves to continue good practices, change an undesired trait or behavior, accomplish a personal goal, or improve their life. New Year's resolutions are mainly made to improve on where we are. There are famous resolutions like getting in shape, switching to a healthy diet, meeting new people, and a lot more. A resolution can also be made to drop a bad habit, for example no smoking, or to start a good one, like waking up early in the morning.

Here we should cover both why resolutions are required and why, most of the time, they don't work.

A New Year's resolution can become a strong motivational tool and can help in developing a skill or starting a good habit. But this doesn't always work: if you break or don't follow your resolution, it can be demotivating. Still, to start a habit, a New Year's resolution can be a great boon and, when designed properly, helps in tracking progress.

According to U.S. News & World Report, the failure rate for New Year's resolutions is about 80 percent, and most people lose their resolve by mid-February. If this is true, it's clear that there is nothing wrong with us; the problem is in the tradition itself. The reason may be anything from lack of clarity to setting expectations too high. I agree, but I think there's more to it: we are unreasonable about resolutions. Out of over-excitement we may start a resolution, but it backfires when we realize it isn't practically possible.

Even though I'm writing this blog, my own track record with New Year's resolutions is not good. So I researched how other people (especially developers) make and follow resolutions. I went through a lot of blogs, and I will list what I found useful. I haven't followed any of it yet, but will implement it along with you.

From Quincy Larson's blog (linked at the end of this post) I learned three important factors for making a successful new year resolution.

Relevance: does the resolution mean anything to you? If you love JavaScript, then learning React (and understanding how it differs from other frameworks) is relevant to you; if not, you may drop it midway. Close your eyes and imagine you have completed your resolution. If that brings a smile to your face and boosts your excitement, it is most likely a relevant resolution to take up.

Accountability: this is how you check whether you can really do it. Tell your friends and family about it, or post it on Twitter, so that you hold yourself accountable.

Support: do you have friends or family who can help you finish the resolution and can boost your confidence when you feel dull?

Divide the resolution into smaller chunks, each with a shorter period, and complete them one by one so that you are tracking your progress. You can join existing social tags like #100DaysOfCode from freeCodeCamp or #100DaysOfML from Siraj Raval, or start your own tag.

I have read in a blog that it helps to identify a single word that reminds you of your whole resolution. That word should create a positive vibe whenever you feel low; it should inspire you, and you can use it as a mantra.

Make the resolution part of your routine. This can be useful: if learning competitive coding is your new year's resolution, practice it daily for one hour at the same time, so that it becomes part of your life.

I hope this blog has helped you and that you have enjoyed reading it.

Credits: inspired by Quincy Larson's new year's resolution blog.

---

I usually read technical blogs and have found that they are an awesome way to understand new technology. If your knowledge helps others, what could be better? The first step is to find where to post blogs. Medium is a wonderful website, but it's better to also have a personal blog site, and that is easy with Gatsby. In this post I will explain in detail how to create a blog site, along with all the difficulties I ran into and their solutions.

This tutorial will use gatsby-personal-starter-blog, a Gatsby starter based on the official gatsby-starter-blog. The differences are that gatsby-personal-starter-blog is configured to run the blog on a subdirectory, /blog, and comes pre-installed with Netlify CMS for content editing. It also adds VS Code highlighting for code blocks.

Before we start, you should have a **GitHub account** and a basic understanding of React.

Let's start, step by step.

First, check that Node.js is installed by typing `node --version` and `npm --version` in your terminal. If it is not installed, see the Node.js docs.

The Gatsby CLI tool helps you quickly create new Gatsby-powered sites and run commands for developing Gatsby sites. It is a published npm package.
The Gatsby CLI is available via npm and should be installed globally by running `npm install -g gatsby-cli`

Open your terminal and run the following Gatsby CLI command to create a new Gatsby site from any one of the starter libraries. I personally used both gatsby-starter-blog and Thomas Wang's gatsby-personal-starter-blog (which serves the blog on a /blog page). The command is:
`gatsby new [your-project-name] [github link of starter blog]`

For example:
`gatsby new myblog https://github.com/gatsbyjs/gatsby-starter-blog`

I recommend using gatsby-personal-starter-blog. Once Gatsby has finished installing all the packages and dependencies, go into the directory and run the site locally:
`cd myblog`
`gatsby develop`

**If you get an error from the first command or from `gatsby develop`, I explain how to debug it at the end of this blog.**

Now you can go to `localhost:8000` to see your new site. What's great is that Netlify CMS comes pre-installed, and you can access it at `localhost:8000/admin` if you used **gatsby-personal-starter-blog**.
A CMS, or content management system, is useful because you can add content like blog posts from a dashboard on your site, instead of having to add posts manually in Markdown. However, you'll likely want to access the CMS from the deployed website, not just locally. For that, you'll need to deploy to Netlify through GitHub, set up continuous deployment, and do a few configurations.

Open the project in your code editor (preferably VS Code) and open `static/admin/config.yml`. Replace `your-username/your-repo-name` with your GitHub username and project name. This step comes in handy when using Netlify CMS:

```
backend:
-  name: test-repo
+  name: github
+  repo: your-username/your-repo-name
```

Customize the code to your needs (for example, add your own info in bio.js), then open github.com, create a new repository with the same name as your project, and push your code to it.
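Besides bio.js, most Gatsby starters keep personal details in the `siteMetadata` block of `gatsby-config.js`. The exact fields vary by starter, so treat the snippet below as an illustrative sketch (the values and the URL are placeholders, not from the starter):

```javascript
// gatsby-config.js — illustrative sketch; check your starter's own file
// for the exact fields it expects.
module.exports = {
  siteMetadata: {
    title: `My Blog`,                          // shown in the header and <title>
    author: `Your Name`,                       // typically consumed by bio.js
    description: `A blog about things I learn`,
    siteUrl: `https://your-site.netlify.app/`, // placeholder deployed URL
  },
  // ...keep the starter's existing `plugins` array as-is
};
```

Components query these values through Gatsby's GraphQL layer, so editing them here updates the bio and site title everywhere at once.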

Open app.netlify.com and click "New site from Git". Choose your newly created repo and click "Deploy site" with the default deployment settings.

To make sure that Netlify CMS has access to your GitHub repo, you need to set up an OAuth application on GitHub. The instructions for that are in Netlify's "Using an Authorization Provider" guide. You can stop after saving the Client ID and Secret; the rest is already done.

Congrats! Now that Netlify CMS is successfully configured for your project, every time you add a new post its content will be stored in your repository and versioned on GitHub, because Netlify CMS is Git-based. Also, thanks to Netlify's continuous deployment, a new version will be deployed every time you add or edit a post.

**Congrats!!! Finally done after the long wait.**

Credits: Thomas Wang, for explaining the Gatsby starter in the official docs.

**1)** There may be a problem with libvips, so there is a chance you will get an error (I got one; it is common on Fedora). To fix it, delete the `/Users/[your-username]/.npm/_libvips/[some .tar.gz]` file. After deleting that .tar.gz file, run `npm install` and it should work.