The post 20 Machine Learning/Artificial Intelligence Influencers To Follow In 2020 appeared first on HackerEarth Blog.
Machine Learning (ML) is emerging as one of the hottest fields today. It has penetrated numerous aspects of our everyday life, from Siri and Alexa, Facebook/Instagram friend suggestions, and Gmail spam filters to traffic congestion predictions and customer support chatbots. The Machine Learning market is ever-growing, predicted to expand at a CAGR of 43.8% from 2019 to 2025 and reach an estimated valuation of USD 96.7 billion by the end of 2025.
Consequently, there has been a significant increase in the number of Machine Learning enthusiasts across the globe. While there are scores of ML-related resources available across platforms, they can quickly become overwhelming for beginners. Given this scenario, a good first step is to follow ML/AI leaders closely, in order to seek advice and gain insights into current trends and technologies.
We have curated a list of 20 influencers in ML/AI, including industry experts, thought leaders, academic professionals, and the like.
1. Adam Coates
Follow @adampaulcoates
A Deep Learning leader, Adam Coates is currently the Director of Apple’s Special Projects Group. He has also served as the Director at Baidu’s Silicon Valley AI Lab, where his team worked on various technologies such as Deep Learning, Natural Language Processing (NLP), and High-Performance Computing (HPC). The team also created an end-to-end Deep Learning speech system, Deep Speech, and a multi-speaker text-to-speech engine, Deep Voice.
2. Alex Champandard
Follow @alexjc
With over twenty years of experience in the Artificial Intelligence (AI) space, Alex Champandard is the co-founder of Creative.ai, a startup that aims at building AI/ML-powered tools for designers and artists. He has also co-organized Nucl.ai, one of the largest conferences dedicated to the use of AI technology in the creative industry. Earlier in his career, he worked in the computer entertainment and simulation industries for many years. He is also the author of AI Game Development: Synthetic Creatures with Learning and Reactive Behaviors, discussing various techniques and theories involved in AI-based game development.
3. Andreas Mueller
Follow @amuellerml
Andreas Mueller is a Research Scientist at the Data Science Institute at Columbia University. He is also one of the core developers of scikit-learn—a Machine Learning library for Python. In addition, he has co-authored Introduction to Machine Learning with Python, elaborating on practical approaches to Machine Learning using Python. Earlier, he worked as an Assistant Research Scientist at the Center for Data Science at New York University and as a Machine Learning Scientist at Amazon. He is extremely passionate about open source and open science and is on a mission to make high-quality ML methods and applications easily accessible to everyone.
4. Andrew Ng
Follow @AndrewYNg
Andrew Ng is one of the most sought-after leaders in the Machine Learning arena. He co-founded Coursera and launched a Deep Learning specialization course, deeplearning.ai. He is also the founder and CEO of Landing AI, an organization that helps businesses become entirely AI-driven. Additionally, he is an adjunct professor of Computer Science and leads an AI research group at Stanford University. Formerly, he founded and led Google’s Deep Learning project, Google Brain, whose work was later deployed in numerous products such as object detection, speech recognition, Street View, and more.
5. Dr. Craig Brown
Follow @DrDataScientist
With over thirty years of experience in working on multiple technological projects across all industries, Dr. Craig Brown is a technology expert and a thought leader. Primarily, his thought leadership is focused on leveraging Big Data, Machine Learning, and Data Science to drive and enhance an organization’s business, address business challenges, and lead innovation.
6. Dr. Fei-Fei Li
Follow @drfeifei
Dr. Fei-Fei Li is a Computer Science professor at Stanford University and the co-director of the Stanford University Human-Centered AI Institute. She is also the co-founder of a non-profit organization, AI4ALL, which aims at educating the next generation of AI enthusiasts. Prior to this, Dr. Fei-Fei Li served as the Chief Scientist of AI/ML and Vice President at Google Cloud, overseeing research, engineering, and development efforts for all AI/ML products of Google Cloud.
7. Gary Marcus
Follow @GaryMarcus
Gary Marcus is the CEO and founder of Robust.ai, which offers an industrial-grade cognitive platform to enable smart, robust, and safe robots. He recently co-authored Rebooting AI: Building Artificial Intelligence We Can Trust along with Ernest Davis. He is a cognitive scientist and a professor of Psychology and Neural Science at New York University. Prior to that, he served as a Director at Uber AI Labs.
8. Geoffrey Hinton
Follow @geoffreyhinton
Fondly known as the Godfather of Deep Learning, Geoffrey Hinton is a professor in the Department of Computer Science at the University of Toronto. He also joined Google’s AI research team, Google Brain, as a researcher. His expertise lies in artificial neural networks. Along with Yoshua Bengio and Yann LeCun, he is regarded as one of the Godfathers of AI, and co-received the 2018 ACM A.M. Turing Award. Furthermore, he has authored Neural Network Architectures for Artificial Intelligence.
9. Hilary Mason
Follow @hmason
Hilary Mason has been in the Data Science field for over twenty years now. With her passion for data, she became the founder and CEO of Fast Forward Labs, which aimed at helping organizations use Machine Learning and Data Science advancements to scale up their businesses. Fast Forward Labs was later acquired by Cloudera, where she went on to become the GM of Machine Learning. She is currently serving as a Data Scientist in Residence at Accel Partners, advising on various data strategies and investment opportunities. She also co-founded hackNY, which mentors the next generation of New York’s developers for the creative technology community. Earlier in her career, she served as Chief Scientist at Bitly.
10. Ian Goodfellow
Follow @goodfellow_ian
Currently employed as the Director of Machine Learning in the Special Projects Group at Apple Inc., Ian Goodfellow has contributed significantly to the Deep Learning space. He is the inventor of generative adversarial networks, an ML technique that is being used by Facebook. Earlier in his career, he worked at Google, playing a key role in the Street View (Google Maps) and Google Brain (AI research) teams. Besides that, he has also co-authored a comprehensive book, Deep Learning, alongside Yoshua Bengio and Aaron Courville.
11. Jason Brownlee
Follow @TeachTheMachine
With the aim of ‘making developers awesome at Machine Learning’, Jason Brownlee founded Machine Learning Mastery, a community offering various resources to help developers enhance their applied Machine Learning skills.
12. Jess Hamrick
Follow @jhamrick
Currently employed as a research scientist at DeepMind, Jess Hamrick is a cognitive science enthusiast. Her research focuses on human cognition, combining ML models with cognitive science. She is also one of the key maintainers of Jupyter/nbgrader—an open-source tool used to create and grade assignments in Jupyter notebooks.
13. Dr. Kirk Borne
Follow @KirkDBorne
Dr. Kirk Borne, a data scientist and astrophysicist, is one of the leading influencers in the Big Data/Data Science/AI space. He is currently employed as the Principal Data Scientist and Executive Advisor at Booz Allen Hamilton. He was also a professor of astrophysics and computational science at George Mason University for over twelve years. His work has contributed significantly to various projects, including NASA’s Hubble Space Telescope.
14. Martin Ford
Follow @MFordFuture
Martin Ford is a well-acclaimed futurist and a keynote speaker, elaborating on topics such as AI and robotics, and their possible impacts on the market, economy, and society. He is also an author of three books, including the New York Times bestseller, Rise of the Robots: Technology and the Threat of a Jobless Future. He is also the Consulting Artificial Intelligence Expert for the Rise of the Robots Index project for Societe Generale Corporate and Investment Banking.
15. Mike Tamir
Follow @MikeTamir
Mike Tamir is currently the Chief Machine Learning Scientist and head of ML/AI at Susquehanna International Group, LLP (SIG). He is also a Data Science faculty member at UC Berkeley. Prior to this, he served as the Head of Data Science at Uber Advanced Technologies Group, and as the Chief Science Officer at Galvanize Inc. Earlier in his career, he was a faculty member at the University of Pittsburgh and Columbia University.
16. Oriol Vinyals
Follow @OriolVinyalsML
Oriol Vinyals is a Principal Research Scientist at Google DeepMind, leading the Deep Learning team there. He also led the AlphaStar team, which developed the first AI to defeat top professional players of StarCraft II. In the past, he was a Senior Research Scientist on the Google Brain team.
17. Peter Skomoroch
Follow @peteskomoroch
Presently serving as a senior executive and investor for numerous ML-driven startups and venture capital funds, Peter Skomoroch has over twenty years of experience in the Data Science industry. Over the years, he has worked as a Senior Research Engineer at the AOL Search Analytics team, Director of Analytics at Juice Analytics, Principal Data Scientist at LinkedIn, CEO and Co-founder of SkipFlag, and Head of AI Automation & Data Products at Workday, among various other roles. At LinkedIn, he played a key role in ideating, creating, and deploying LinkedIn Skills and Endorsements.
18. Soumith Chintala
Follow @soumithchintala
Soumith Chintala co-created and led PyTorch, an open-source Machine Learning library developed by the Facebook AI Research lab for Computer Vision and Natural Language Processing applications. He is also an active researcher in the ML space, having worked in the past on projects such as Google Street View house numbers, pedestrian detection, and sentiment analysis, and having spent time at New York University.
19. Yann LeCun
Follow @ylecun
Yann LeCun is the VP and Chief AI Scientist at Facebook, leading the scientific and technical AI research and development for the organization. In addition, he is a professor at New York University. Early on in his career, he headed the Image Processing Research Department at AT&T Labs Research. Being one of the Godfathers of AI, he has made major contributions to the fields of Computer Vision and Optical Character Recognition. He is also one of the 2018 ACM A.M. Turing Award laureates for his contributions to the AI domain.
20. Yoshua Bengio
Yoshua Bengio is one of the pioneers in the ML space, owing to his work on artificial neural networks and Deep Learning. He has been a professor in the Department of Computer Science and Operations Research at the Université de Montréal for over twenty-five years. He also heads the Montreal Institute for Learning Algorithms. Yoshua Bengio, Geoffrey Hinton, and Yann LeCun are considered the Godfathers of AI and were awarded the 2018 ACM A.M. Turing Award for achieving major breakthroughs in deep neural networks.
The post Hottest tech skills to hire for in 2020 appeared first on HackerEarth Blog.
The benefits of honing technical skills go far beyond the Information Technology industry. Strong tech skills are essential in today’s changing world, and if your employees consistently and proactively enhance their IT skills, you will help them improve both personally and professionally. This, in turn, will help your business grow.
Yes, it may feel overwhelming. However, with the right attitude and flexibility of mind, it can also be a tremendous opportunity for your employees to learn and grow. Here are some of the hottest tech skills (a mix of programming languages, tools, and frameworks; in random order) to hire for in 2020, which will help you thrive in the workplace of tomorrow.
JavaScript has been the fastest-growing and most sought-after programming language for years. It is considered one of the smartest choices for building interactive web interfaces, as all modern browsers support JavaScript.
Source: Twitter
The Stack Overflow developer survey shows that about 69.7% of roughly 90,000 professional developers use JavaScript, making it the most commonly used programming language. The same survey reveals that JavaScript is also among the most wanted languages: 17.8% of respondents have not yet used it but want to learn it.
The language is at the heart of several prominent tech companies, such as Netflix, PayPal, Groupon, LinkedIn, and Walmart. Additionally, studies reveal that JavaScript is among the most in-demand programming languages at the top privately held US startups valued at over $1 billion. Hence, JavaScript will remain one of the hottest tech skills in 2020, and it is unlikely to go off the grid in the near future.
Some of the common job roles requiring JavaScript as a skill are:
Created by Guido van Rossum and first released in 1991, Python remains extremely relevant for developers to learn and grow with. It is interactive, dynamic, and versatile, and remains one of the most relevant languages for 2020.
Source: Coding Dojo
Also, it is one of the most popular programming languages used by the top 25 unicorn companies in the US.
It is an all-time favorite of beginners and experienced developers alike, mainly for its ease of use and simple syntax. For projects ranging from data mining to Machine Learning, Python is the most favored programming language.
Also, read The complete guide to hiring a Python developer.
Some of the common job roles requiring Python as a skill are:
It is no surprise to see Java as one of the hottest tech skills to hire for in 2020. Designed by James Gosling, Mike Sheridan, and Patrick Naughton (work began in 1991, with the first public release in 1995), it is a robust, general-purpose programming language that is object-oriented and class-based. It was designed to be easy to use, write, compile, debug, and learn, and to have as few implementation dependencies as possible.
Studies reveal that Java is one of the most popular programming languages used by developers.
This can be attributed to the fact that Java is widely used in industries such as financial services, Big Data, the stock market, banking, retail, and Android. It is present everywhere! Whichever domain a developer works in, they will surely come across Java.
An article by the Dev Community speaks about how Java is unarguably one of the most popular programming languages in the world today and how tech giants are using the language to build large portions of their infrastructure and backend services.
Also, read The complete guide to hiring a Java developer.
Some of the common job roles requiring Java as a skill are:
Conduct accurate coding assessments to hire the right developers. Request a demo.
For the fourth year in a row, Rust has been voted the most loved programming language in a Stack Overflow report, followed by Python. This means that more developers want to continue working with Rust than with any other language.
Also, as shown by Google Trends, Rust has been gaining tremendous popularity over the years and its adoption is expected to grow.
Tech companies like Google, Amazon, and Microsoft have invested in Rust as a long-term systems programming language because it is expected to replace a lot of C and C++ development. In fact, PYPL ranks Rust 18th in its PopularitY of Programming Language index, with an upward trend.
This makes more sense when you learn that the language was created at Mozilla, giving developers a chance to write code that is more performant than Ruby, PHP, JavaScript, or Python.
Some of the common jobs requiring Rust as a tech skill are:
Released in 2013, ReactJS is essentially a front-end library created by Facebook for building user interfaces. It serves as an excellent tool for the development of full-scale, dynamic applications.
As per a Stack Overflow report, ReactJS is the most wanted and most loved web framework.
A great performance benefit of ReactJS is its use of a virtual DOM. Because the virtual DOM can be rendered on the server side as well as the client side, it offers high-performance rendering of complex user interfaces. This is why ReactJS is fast. Beyond Facebook and Instagram, ReactJS has been adopted by the BBC, Netflix, and PayPal.
Some of the common job roles requiring ReactJS as a skill are:
Looking to hire ReactJS developers? Identify top candidates with HackerEarth Assessments.
Docker is a tool that creates, deploys, and runs applications within containers.
Containers package code together with all its dependencies so that an application runs quickly and reliably on any other Linux machine. The prevalence of Docker in the job market is incredible. In a Stack Overflow survey, developers ranked Docker number 2 in the “Most Loved Platform” category and number 1 in the “Most Wanted Platform” category.
With cloud and Docker becoming significantly linked every day, the demand is only expected to grow. Therefore, if your employees want a wonderful future in DevOps in 2020, they need to have a strong understanding of Docker tools.
Some of the common job roles requiring Docker as a skill are:
There would be no Data Science in Python without NumPy and Pandas (this is also one of the reasons why Python has become so popular in Data Science). As per GitHub, over half of the most popular public repositories labeled with topics like “Deep Learning,” “Natural Language Processing,” and “Machine Learning” are built on NumPy. Pandas is a widely used tool, particularly for data munging and wrangling. It is available to everyone as an open-source, free-to-use project. Hence, NumPy and Pandas are expected to be among the tech skills to hire for in 2020.
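As a small illustration of the kind of data wrangling Pandas is used for (the column names and values below are made up for the example):

```python
import pandas as pd

# Hypothetical survey data with a missing value
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "score": [10.0, None, 7.0, 9.0],
})

# Fill the gap with the overall mean, then aggregate per team
df["score"] = df["score"].fillna(df["score"].mean())
summary = df.groupby("team")["score"].mean()
print(summary["b"])  # 8.0
```

A few lines like these replace what would otherwise be a loop-heavy cleanup script, which is much of Pandas’ appeal for data munging.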
Some of the common job roles requiring NumPy and Pandas as skills are:
Kotlin is a general-purpose programming language that effortlessly combines object-oriented and functional programming features within it. In a Stack Overflow report, Kotlin made its way into one of the most loved and most wanted programming languages.
Kotlin was designed to be interoperable with Java, which makes Android development faster and more enjoyable. Also, Kotlin addresses major issues that surface in Java, so several Java apps are being rewritten in Kotlin. Brands like Pinterest and Coursera have already moved to Kotlin due to its strong tooling support. It receives a lot of interest from developers and companies alike. Job postings for Kotlin increased over 15x from the second quarter of 2016 to the second quarter of 2018, and the trend is only expected to grow.
Source: Dice
Hence, Kotlin is a hot tech skill that programmers and Android app developers should learn in 2020.
Some of the common job roles requiring Kotlin as a skill are:
Django is one of the most versatile and popular Python web frameworks that encourages rapid development and pragmatic, clean design of web applications. This can be attributed to Django’s open-source nature—the community is constantly releasing new code and plug-ins to simplify the process and keep up with the demand. It grabbed eyeballs right from the start when it was positioned as Python’s answer to Rails.
Many Python development services, as well as major companies such as Spotify and YouTube, use Django. Developers describe it as “batteries included”, meaning it ships with most of what developers need built in rather than relying on third-party libraries.
Django’s user base is expected to grow as more developers embrace Python for emerging technologies such as Machine Learning and Big Data. It is ridiculously fast, reassuringly secure, and exceedingly scalable.
Note: We recommend using the latest version of Django, which is currently 3.0.2.
Some of the common job roles requiring Django as a skill are:
All developers have a thirst for learning new skills. However, knowing which skills are gaining popularity can ensure better career growth and help developers prioritize learning them first. Recruiters and developers can use the information shared in this post to make informed decisions in this matter.
As a recruiter, you need to keep yourself abreast of the above-mentioned skills to stay ahead of your competitors in hiring stellar talent.
Not sure about how to assess technical skills? HackerEarth provides accurate technical screening and helps you hire the best. Start your 14-day free trial today.
The post R Algorithms in AI and computing forces working together: A small industry insight appeared first on HackerEarth Blog.
When it comes to understanding computing processes, especially in today’s front-end and back-end development world, analysis usually revolves around the algorithmic architecture of tools, applications, or more complex pieces of software.
In fact, a thorough analysis of the algorithmic side of the computing industry leads to a common conclusion: algorithmic functions are being combined with rendering languages to build much more complex tools.
Let’s analyze some of these.
The biggest mass-market application of Python today is in front-end tools installed on enterprise sites.
This includes tools for web personalization, retargeting, remarketing, and Big Data manipulation.
These tools work by restructuring a catalog around specific user preferences.
This is done with the combination of Python features and R-rendering algorithms.
Python scripts gather data from specific landing pages, which is then stored in a container (generally JavaScript).
After this is done, R algorithms are set up to automatically render the data, generally via AngularJS-coded scripts.
In this particular case, the R functions act simply as a processing layer.
The above-mentioned process (gathering via Python, processing in R, exporting in JavaScript) is pretty common across a variety of architectures; depending on the use case, the main variable is which programming language handles the “export” side.
To better explain this, let’s look at the two most common export languages: JavaScript and C#.
JavaScript exports are common within CMS-based tools (those installed on platforms like WordPress, Magento, or Shopify), given how easily JavaScript can be applied to such portals.
C#, on the other hand, is used when the tool (or software) is native and, therefore, the rendering language used to output the information must be tailored to the underlying architecture.
Although the subject may sound obscure and complicated to many, the combination of R algorithms with rendering languages (and computing power in general) can be grouped under the AI umbrella.
This is possible because, technically, those features (data gathering, processing, and printing) are related to AI as a whole.
AI applications in 2019 have, in fact, moved toward exactly this: fast processing, personalization, and projections built on Big Data gathered automatically without any human input.
Futuristic projections of AI controlling our lives still belong to science fiction, and given how they are covered in many technology blogs and newspapers, such narratives do a disservice to an industry that is advancing rapidly in both development and business awareness.
Pieces of software that combine R algorithms, rendering languages, and data automation have been covered by a variety of industry analysts.
These industry analysts have pointed out how they are building a futuristic architecture that is very likely to dominate the way we perceive data processing.
On top of everything said above, a significant part of the mobile market is taking the same approach.
As we know, mobile has definitely become quite important, both from a development point of view (with new applications) and a purely business-related one (with many investors and new startups becoming enterprises).
Many app developers have pointed out that algorithmic features within complex builds (especially on iOS) are now being embraced in the UK, which has emerged as a European technological powerhouse.
We can safely say that this will become the industry standard in the near future.
Take a free Python and Machine Learning programming tutorial for a better understanding.
The post Object detection for self-driving cars appeared first on HackerEarth Blog.
In the previous blog, Introduction to Object detection, we learned the basics of object detection. We also got an overview of the YOLO (You Only Look Once) algorithm. In this blog, we will extend our learning and dive deeper into the YOLO algorithm. We will cover topics such as the Intersection over Union metric, non-max suppression, multiple object detection, anchor boxes, etc. Finally, we will build an object detection system for a self-driving car using the YOLO algorithm. We will be using the Berkeley driving dataset to train our model.
Before we get into building the various components of the object detection model, we will perform some preprocessing steps. The preprocessing steps involve resizing the images (according to the input shape accepted by the model) and converting the box coordinates into the appropriate form. Since we will be building an object detector for a self-driving car, we will be detecting and localizing eight different classes. These classes are ‘bike’, ‘bus’, ‘car’, ‘motor’, ‘person’, ‘rider’, ‘train’, and ‘truck’. Therefore, our target variable is defined as:
where,
\begin{equation}
\hat{y} ={
\begin{bmatrix}
{p_c} & {b_x} & {b_y} & {b_h} & {b_w} & {c_1} & {c_2} & {\dots} & {c_8}
\end{bmatrix}}^T
\end{equation}
p_{c} : Probability/confidence of an object being present in the bounding box
b_{x}, b_{y} : coordinates of the center of the bounding box
b_{w} : width of the bounding box w.r.t the image width
b_{h} : height of the bounding box w.r.t the image height
c_{i} : probability of the i-th class
But since the box coordinates provided in the dataset are in the following format: x_{min}, y_{min}, x_{max}, y_{max} (see Fig 1.), we need to convert them according to the target variable defined above. This can be implemented as follows:
W : width of the original image
H : height of the original image
\begin{equation}
b_x = \frac{(x_{min} + x_{max})}{2 * W}\ , \ b_y = \frac{(y_{min} + y_{max})}{2 * H} \\
b_w = \frac{(x_{max} - x_{min})}{W}\ , \ b_h = \frac{(y_{max} - y_{min})}{H}
\end{equation}
def process_data(images, boxes=None):
    """Process the data."""
    images = [PIL.Image.fromarray(i) for i in images]
    orig_size = np.array([images[0].width, images[0].height])
    orig_size = np.expand_dims(orig_size, axis=0)

    # Image preprocessing: resize to the model input shape and scale to [0, 1]
    processed_images = [i.resize((416, 416), PIL.Image.BICUBIC) for i in images]
    processed_images = [np.array(image, dtype=float) for image in processed_images]
    processed_images = [image / 255. for image in processed_images]

    if boxes is not None:
        # Box preprocessing
        # Original boxes stored as a 1D list of class, x_min, y_min, x_max, y_max
        boxes = [box.reshape((-1, 5)) for box in boxes]
        # Get extents as y_min, x_min, y_max, x_max, class for comparison with
        # model output
        box_extents = [box[:, [2, 1, 4, 3, 0]] for box in boxes]
        # Get box parameters as x_center, y_center, box_width, box_height, class
        boxes_xy = [0.5 * (box[:, 3:5] + box[:, 1:3]) for box in boxes]
        boxes_wh = [box[:, 3:5] - box[:, 1:3] for box in boxes]
        boxes_xy = [box_xy / orig_size for box_xy in boxes_xy]
        boxes_wh = [box_wh / orig_size for box_wh in boxes_wh]
        boxes = [np.concatenate((boxes_xy[i], boxes_wh[i], box[:, 0:1]), axis=-1)
                 for i, box in enumerate(boxes)]

        # Find the max number of boxes in any image
        max_boxes = 0
        for boxz in boxes:
            if boxz.shape[0] > max_boxes:
                max_boxes = boxz.shape[0]

        # Add zero padding so every image has the same number of boxes
        for i, boxz in enumerate(boxes):
            if boxz.shape[0] < max_boxes:
                zero_padding = np.zeros((max_boxes - boxz.shape[0], 5), dtype=np.float32)
                boxes[i] = np.vstack((boxz, zero_padding))

        return np.array(processed_images), np.array(boxes)
    else:
        return np.array(processed_images)
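As a quick sanity check of the coordinate conversion above, take a toy 100×100 image and a box from (20, 30) to (60, 70):

```python
# Toy numeric check of the box-coordinate conversion
W, H = 100, 100                              # original image size
x_min, y_min, x_max, y_max = 20, 30, 60, 70  # box corners

b_x = (x_min + x_max) / (2 * W)              # normalized center x -> 0.4
b_y = (y_min + y_max) / (2 * H)              # normalized center y -> 0.5
b_w = (x_max - x_min) / W                    # normalized width    -> 0.4
b_h = (y_max - y_min) / H                    # normalized height   -> 0.4
print(b_x, b_y, b_w, b_h)  # 0.4 0.5 0.4 0.4
```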
Intersection over Union (IoU) is an evaluation metric that is used to measure the accuracy of an object detection algorithm. Generally, IoU is a measure of the overlap between two bounding boxes. To calculate this metric, we need two boxes: the ground-truth bounding box (the hand-labeled box from the dataset) and the predicted bounding box from our model.
Intersection over Union is the ratio of the area of intersection over the union area occupied by the ground truth bounding box and the predicted bounding box. Fig. 2 shows the IoU calculation for different bounding box scenarios.
Now, that we have a better understanding of the metric, let’s code it.
def IoU(box1, box2):
    """
    Returns the Intersection over Union (IoU) between box1 and box2

    Arguments:
    box1: coordinates: (x1, y1, x2, y2)
    box2: coordinates: (x1, y1, x2, y2)
    """
    # Calculate the intersection area of the two boxes
    xi1 = max(box1[0], box2[0])
    yi1 = max(box1[1], box2[1])
    xi2 = min(box1[2], box2[2])
    yi2 = min(box1[3], box2[3])
    # Clamp at zero so non-overlapping boxes yield zero intersection
    area_of_intersection = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)

    # Calculate the union area of the two boxes
    # A U B = A + B - A ∩ B
    A = (box1[2] - box1[0]) * (box1[3] - box1[1])
    B = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = A + B - area_of_intersection

    intersection_over_union = area_of_intersection / union_area
    return intersection_over_union
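To sanity-check the metric, here is a minimal standalone version of the same computation (mirroring the function above) applied to two overlapping 2×2 boxes:

```python
def iou(box1, box2):
    # Boxes given as (x1, y1, x2, y2); mirrors the IoU function above
    xi1, yi1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    xi2, yi2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    # Clamp at zero so disjoint boxes yield zero intersection
    inter = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)
    a = (box1[2] - box1[0]) * (box1[3] - box1[1])
    b = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter / (a + b - inter)

# Intersection is 1x1 = 1; union is 4 + 4 - 1 = 7
print(round(iou((0, 0, 2, 2), (1, 1, 3, 3)), 4))  # 0.1429
```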
Instead of building the model from scratch, we will be using a pre-trained network and applying transfer learning to create our final model. You only look once (YOLO) is a state-of-the-art, real-time object detection system, which has a mAP on VOC 2007 of 78.6% and a mAP of 48.1% on the COCO test-dev. YOLO applies a single neural network to the full image. This network divides the image into regions and predicts the bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities.
One of the advantages of YOLO is that it looks at the whole image at test time, so its predictions are informed by the global context of the image. Unlike R-CNN, which requires thousands of network evaluations for a single image, YOLO makes predictions with a single network pass. This makes the algorithm extremely fast: over 1000x faster than R-CNN and 100x faster than Fast R-CNN.
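For intuition about that single pass, the network’s output can be pictured as one tensor over the whole grid. The grid size and anchor count below are assumptions for illustration (a 13×13 grid with 5 anchor boxes is typical of YOLOv2):

```python
import numpy as np

S = 13   # grid cells per side (assumed)
B = 5    # anchor boxes per cell (assumed)
C = 8    # classes, as defined for the driving dataset above

# Each box predicts (p_c, b_x, b_y, b_h, b_w) plus C class scores,
# so one forward pass produces a single S x S x B x (5 + C) tensor
output = np.zeros((S, S, B, 5 + C))
print(output.shape)  # (13, 13, 5, 13)
```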
If the target variable y is defined as
\begin{equation}
y ={
\begin{bmatrix}
{p_c} & {b_x} & {b_y} & {b_h} & {b_w} & {c_1} & {c_2} & {\dots} & {c_8}
\end{bmatrix}}^T \\
\begin{matrix}
& {y_1} & {y_2} & {y_3} & {y_4} & {y_5} & {y_6} & {y_7} & {\dots} & {y_{13}}
\end{matrix}
\end{equation}
the loss function for object localization is defined as
\begin{equation}
\mathcal{L(\hat{y}, y)} =
\begin{cases}
(\hat{y_1} - y_1)^2 + (\hat{y_2} - y_2)^2 + \dots + (\hat{y_{13}} - y_{13})^2 &&, y_1 = 1 \\
(\hat{y_1} - y_1)^2 &&, y_1 = 0
\end{cases}
\end{equation}
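The two cases of this loss can be written out directly; a minimal NumPy sketch (the function name and example values are ours):

```python
import numpy as np

def localization_loss(y_hat, y):
    """Squared-error loss from the equation above.

    y = [p_c, b_x, b_y, b_h, b_w, c_1, ..., c_8] (13 components).
    All terms count when an object is present (y_1 = 1); otherwise
    only the confidence term is penalized.
    """
    y_hat, y = np.asarray(y_hat, dtype=float), np.asarray(y, dtype=float)
    if y[0] == 1:
        return float(np.sum((y_hat - y) ** 2))
    return float((y_hat[0] - y[0]) ** 2)

# Object present; only the confidence is off by 0.1 -> loss 0.01
y_true = [1, 0.5, 0.5, 0.2, 0.2, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0.9, 0.5, 0.5, 0.2, 0.2, 1, 0, 0, 0, 0, 0, 0, 0]
print(round(localization_loss(y_pred, y_true), 4))  # 0.01
```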
Let’s now see how the loss function for the YOLO algorithm is calculated.
In general, the target variable is defined as
\begin{equation}
y ={
\begin{bmatrix}
{p_i(c)}& {x_i} & {y_i} & {h_i} & {w_i} & {C_i}
\end{bmatrix}}^T
\end{equation}
where,
p_{i}(c) : Probability/confidence of an object being present in the bounding box.
x_{i}, y_{i} : coordinates of the center of the bounding box.
w_{i} : width of the bounding box w.r.t the image width.
h_{i} : height of the bounding box w.r.t the image height.
C_{i} : Probability of the i-th class.
then the corresponding loss function is calculated as
\begin{equation}
\mathcal{L} = \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
+ \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} \left[ (\sqrt{w_i} - \sqrt{\hat{w}_i})^2 + (\sqrt{h_i} - \sqrt{\hat{h}_i})^2 \right] \\
+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} (p_i(c) - \hat{p}_i(c))^2 + \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} (p_i(c) - \hat{p}_i(c))^2 \\
+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c} (C_i - \hat{C}_i)^2
\end{equation}
where S^2 is the number of grid cells, B is the number of bounding boxes predicted per grid cell, 1_{ij}^{obj} equals 1 if the j-th box predictor in cell i is responsible for an object (and 0 otherwise), 1_{ij}^{noobj} is its complement, and 1_{i}^{obj} equals 1 if an object appears in cell i.
The above equation represents the YOLO loss function. The equation may seem daunting at first, but on a closer look we can see it is the sum of the coordinate loss, the confidence loss, and the classification loss, in that order. We use a sum of squared errors because it is easy to optimize. However, it weights the localization error equally with the classification error, which may not be ideal. To remedy this, we increase the loss from bounding box coordinate predictions and decrease the loss from confidence predictions for boxes that don’t contain objects. We use two parameters, λ_{coord} and λ_{noobj}, to accomplish this.
Note that the loss function only penalizes classification error if an object is present in that grid cell. It also penalizes the bounding box coordinate error only if that predictor is “responsible” for the ground truth box (i.e., the predictor with the highest IoU of any predictor in that grid cell).
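For illustration, the way these weighted terms combine can be sketched in NumPy. This is a simplified sketch, not the training loss used in the code below: it assumes predictions and targets have already been matched per grid cell, and it uses the weights λ_coord = 5 and λ_noobj = 0.5 from the YOLO paper; the function name and array layout are invented for the example.

```python
import numpy as np

def yolo_loss_terms(pred, target, responsible, lambda_coord=5.0, lambda_noobj=0.5):
    """Sketch of the YOLO loss for already-matched predictions.

    pred, target: arrays of shape (S*S, B, 5 + C) holding
        (x, y, w, h, confidence, class probabilities) per box.
    responsible: boolean mask of shape (S*S, B), True where the
        predictor is responsible for a ground-truth object.
    """
    obj = responsible
    noobj = ~responsible
    # Coordinate loss (square roots on w, h damp the effect of box size).
    xy_loss = np.sum(obj[..., None] * (pred[..., 0:2] - target[..., 0:2]) ** 2)
    wh_loss = np.sum(obj[..., None] * (np.sqrt(pred[..., 2:4]) - np.sqrt(target[..., 2:4])) ** 2)
    # Confidence loss, weighted down for boxes that contain no object.
    conf_sq = (pred[..., 4] - target[..., 4]) ** 2
    conf_loss = np.sum(obj * conf_sq) + lambda_noobj * np.sum(noobj * conf_sq)
    # Classification loss, only where an object is present.
    class_loss = np.sum(obj[..., None] * (pred[..., 5:] - target[..., 5:]) ** 2)
    return lambda_coord * (xy_loss + wh_loss) + conf_loss + class_loss
```

When prediction equals target the loss is zero; mismatched confidences in empty cells are penalized, but only at half weight.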
The YOLO model has the following architecture (see Fig 3). The network has 24 convolutional layers followed by two fully connected layers. Alternating 1 × 1 convolutional layers reduce the feature space from the preceding layers. The convolutional layers are pretrained on the ImageNet classification task at half the resolution (224 × 224 input image), and the resolution is then doubled for detection.
We will be using the pretrained YOLOv2 model, which has been trained on the COCO image dataset with classes similar to the Berkeley Driving Dataset. So, we will use the YOLOv2 pretrained network as a feature extractor. We will load the pretrained weights of the YOLOv2 model and freeze all the weights except for the last layer during training of the model. We will remove the last convolutional layer of the YOLOv2 model and replace it with a new convolutional layer whose size reflects the number of classes (the 8 classes defined earlier) to be predicted. This is implemented in the following code.
def create_model(anchors, class_names, load_pretrained=True, freeze_body=True):
    """
    load_pretrained: whether or not to load the pretrained model or initialize all weights
    freeze_body: whether or not to freeze all weights except for the last layer

    Returns:
    model_body : YOLOv2 with new output layer
    model : YOLOv2 with custom loss Lambda layer
    """
    detector_mask_shape = (13, 13, 5, 1)
    matching_boxes_shape = (13, 13, 5, 5)

    # Create model input layers
    image_input = Input(shape=(416, 416, 3))
    boxes_input = Input(shape=(None, 5))
    detector_mask_input = Input(shape=detector_mask_shape)
    matching_boxes_input = Input(shape=matching_boxes_shape)

    # Create model body
    yolo_model = yolo_body(image_input, len(anchors), len(class_names))
    topless_yolo = Model(yolo_model.input, yolo_model.layers[-2].output)

    if load_pretrained:
        # Save topless yolo
        topless_yolo_path = os.path.join('model_data', 'yolo_topless.h5')
        if not os.path.exists(topless_yolo_path):
            print('Creating Topless weights file')
            yolo_path = os.path.join('model_data', 'yolo.h5')
            model_body = load_model(yolo_path)
            model_body = Model(model_body.inputs, model_body.layers[-2].output)
            model_body.save_weights(topless_yolo_path)
        topless_yolo.load_weights(topless_yolo_path)

    if freeze_body:
        for layer in topless_yolo.layers:
            layer.trainable = False

    final_layer = Conv2D(len(anchors) * (5 + len(class_names)), (1, 1),
                         activation='linear')(topless_yolo.output)
    model_body = Model(image_input, final_layer)

    # Place model loss on CPU to reduce GPU memory usage.
    with tf.device('/cpu:0'):
        model_loss = Lambda(yolo_loss, output_shape=(1,), name='yolo_loss',
                            arguments={'anchors': anchors,
                                       'num_classes': len(class_names)})(
            [model_body.output, boxes_input, detector_mask_input, matching_boxes_input])

    model = Model([model_body.input, boxes_input, detector_mask_input,
                   matching_boxes_input], model_loss)

    return model_body, model
Due to limited computational power, we used only the first 1000 images present in the training dataset to train the model. Finally, we trained the model for 20 epochs and saved the model weights with the lowest loss.
The YOLO object detection algorithm will predict multiple overlapping bounding boxes for a given image. As not all predicted bounding boxes contain an object of interest (e.g., a pedestrian, bike, car, or truck), we need to filter out the bounding boxes that don’t contain the target object. To implement this, we monitor the value of p_{c}, i.e., the probability or confidence of an object being present in the bounding box. If the value of p_{c} is less than the threshold value, then we filter out that bounding box from the predicted bounding boxes. This threshold may vary from model to model and serves as a hyper-parameter for the model.
If the predicted target variable is defined as:
\begin{equation}
\hat{y} ={
\begin{bmatrix}
{p_c}& {b_x} & {b_y} & {b_h} & {b_w} & {c_1} & {c_2} & … & {c_8}
\end{bmatrix}}^T
\end{equation}
then discard all bounding boxes where the value of p_{c} < threshold value. The following code implements this approach.
def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold=0.6):
    """
    Filters YOLO boxes by thresholding on object and class confidence

    Arguments:
    box_confidence: Probability of the box containing the object
    boxes: The box parameters : (x, y, h, w)
        x, y -> Center of the box
        h, w -> Height and width of the box w.r.t the image size
    box_class_probs: Probability of all the classes for each box
    threshold: Threshold value for box confidence

    Returns:
    scores: containing the class probability score for the selected boxes
    boxes: contains box coordinates for the selected boxes
    classes: contains the index of the class detected by the selected boxes
    """
    # Compute the box scores
    box_scores = box_confidence * box_class_probs

    # Find the index of the class with the maximum box score
    box_classes = K.argmax(box_scores)

    # Find the maximum box score
    box_class_scores = K.max(box_scores, axis=-1)

    # Create a mask for selecting the boxes whose box score is greater than the threshold
    thresh_mask = box_class_scores >= threshold

    # Select the scores, boxes and classes whose box score is greater than the
    # threshold by filtering with the help of thresh_mask.
    scores = tf.boolean_mask(tensor=box_class_scores, mask=thresh_mask)
    classes = tf.boolean_mask(tensor=box_classes, mask=thresh_mask)
    boxes = tf.boolean_mask(tensor=boxes, mask=thresh_mask)

    return scores, classes, boxes
Even after filtering by thresholding over the classes score, we may still end up with a lot of overlapping bounding boxes. This is because the YOLO algorithm may detect an object multiple times, which is one of its drawbacks. A second filter called non-maximal suppression (NMS) is used to remove duplicate detections of an object. Non-max suppression uses ‘Intersection over Union’ (IoU) to fix multiple detections.
Non-maximal suppression is implemented as follows: first, find the box confidence (the probability of the box containing the object) for each detection; then, select the bounding box with the highest box confidence; finally, suppress all bounding boxes that have an IoU greater than the threshold with the selected box, and repeat until no boxes remain.
In case there are multiple classes/objects, non-max suppression is run once for every output class; for example, with four classes it will run four times.
One of the drawbacks of the YOLO algorithm is that each grid cell can only detect one object. But what if we want to detect multiple distinct objects in a grid cell, for example, two overlapping objects or classes that share the same grid cell as shown in the image (see Fig 4.)?
We make use of anchor boxes to tackle this issue. Let’s assume the predicted variable is defined as
\begin{equation}
\hat{y} ={
\begin{bmatrix}
{p_c}& {b_x} & {b_y} & {b_h} & {b_w} & {c_1} & {c_2} & {…} & {c_8}
\end{bmatrix}}^T
\end{equation}
then, we can use two anchor boxes in the following manner to detect two objects in the image simultaneously.
Earlier, the target variable was defined such that each object in the training image is assigned to the grid cell that contains that object’s midpoint. Now, with two anchor boxes, each object in the training images is assigned to the grid cell that contains the object’s midpoint and to the anchor box with the highest IoU for that grid cell. So, with the help of two anchor boxes, we can detect at most two objects simultaneously in an image. Fig 6. shows the shape of the final output layer with and without the use of anchor boxes.
Although we can detect multiple objects using anchor boxes, they still have limitations. For example, if there are two anchor boxes defined in the target variable and the image has three overlapping objects, then the algorithm fails to detect all three objects. Secondly, if two anchor boxes are associated with two objects but have the same midpoint in the box coordinates, then the algorithm fails to differentiate between the objects. Now that we know the basics of anchor boxes, let’s code it.
In the following code, the maximum number of predicted boxes is set to 10. As a result, the algorithm returns at most 10 detected objects for a given image.
def non_max_suppression(scores, classes, boxes, max_boxes=10, iou_threshold=0.5):
    """
    Non-maximal suppression is used to fix multiple detections of the same object.
    - Find the box_confidence (probability of the box containing the object) for each detection.
    - Find the bounding box with the highest box_confidence.
    - Suppress all the bounding boxes which have an IoU greater than iou_threshold
      with the bounding box of maximum box confidence.

    Arguments:
    scores -> containing the class probability score for the selected boxes.
    boxes -> contains box coordinates for the boxes selected after threshold masking.
    classes -> contains the index of the classes detected by the selected boxes.
    max_boxes -> maximum number of predicted boxes to be returned after NMS filtering.

    Returns:
    scores -> predicted score for each box.
    classes -> predicted class for each box.
    boxes -> predicted box coordinates.
    """
    # Converting max_boxes to a tensor
    max_boxes_tensor = K.variable(max_boxes, dtype='int32')

    # Initialize the max_boxes_tensor
    K.get_session().run(tf.variables_initializer([max_boxes_tensor]))

    # Implement non-max suppression using tf.image.non_max_suppression(),
    # which returns the indices corresponding to the boxes you want to keep
    indices = tf.image.non_max_suppression(boxes=boxes, scores=scores,
                                           max_output_size=max_boxes_tensor,
                                           iou_threshold=iou_threshold)

    # tf.gather() selects only the indices present in 'indices' from scores, boxes and classes
    scores = tf.gather(scores, indices)
    classes = tf.gather(classes, indices)
    boxes = tf.gather(boxes, indices)

    return scores, classes, boxes
We can combine both concepts, threshold filtering and non-maximal suppression, and apply them to the output predicted by the YOLO model. This is implemented in the code below.
def yolo_eval(yolo_outputs, image_shape=(720., 1280.), max_boxes=10,
              score_threshold=0.6, iou_threshold=0.5):
    """
    Takes the output of the YOLO encoding/model and filters the boxes using
    score thresholding and non-maximal suppression. Returns the predicted boxes
    along with their scores, box coordinates and classes.

    Arguments:
    yolo_outputs -> Output of the encoding model.
    image_shape -> Input shape.
    max_boxes -> Maximum number of predicted boxes to be returned after NMS filtering.
    score_threshold -> Threshold value for the box class score; if the maximum
        class probability score < threshold, then discard that box.
    iou_threshold -> 'Intersection over Union' threshold used for NMS filtering.

    Returns:
    scores -> predicted score for each box.
    classes -> predicted class for each box.
    boxes -> predicted box coordinates.
    """
    box_xy, box_wh, box_confidence, box_class_probs = yolo_outputs

    # Convert boxes to be ready for filtering functions
    boxes = yolo_boxes_to_corners(box_xy, box_wh)
    scores, classes, boxes = yolo_filter_boxes(box_confidence, boxes,
                                               box_class_probs, score_threshold)

    # Scale boxes back to original image shape.
    boxes = scale_boxes(boxes, image_shape)

    # Perform non-max suppression
    scores, classes, boxes = non_max_suppression(scores, classes, boxes,
                                                 max_boxes, iou_threshold)

    return scores, boxes, classes
We will use the trained model to predict the respective classes and the corresponding bounding boxes on a sample of images. The function ‘draw’ runs a TensorFlow session and calculates the confidence scores, bounding box coordinates, and output class probabilities for the given sample image. Finally, it computes x_{min}, x_{max}, y_{min}, y_{max} from b_{x}, b_{y}, b_{w}, b_{h}, scales the bounding boxes according to the input sample image, and draws the bounding boxes and class probabilities for the objects in the input sample image.
# Loading the path of the test image data
test = glob('data/test/*.jpg')

# Reading and storing the test image data
test_data = []
for i in test:
    test_data.append(plt.imread(i))

# Processing the test image data
test_data = process_data(test_data)

# Predicting the scores, boxes, classes for the given input image
scores, boxes, classes, model_body, input_image_shape = load_yolo(model_body, class_names, anchors)

# Drawing the bounding boxes
draw(model_body, scores, boxes, classes, input_image_shape, test_data,
     image_set='all', out_path='data/test/output/', save_all=False)
Next, we will run the model on a real-time video. Since a video is a sequence of images at different time frames, we will predict the class probabilities and bounding boxes for the image captured at each time frame. We will use OpenCV’s video capture function to read the video and convert it into images/frames at different time steps. The video below demonstrates the algorithm running on a real-time video.
# Path of the stored video file
videopath = 'data/real_time/bdd-videos-sample.mp4'

# Loads the saved trained YOLO model
scores, boxes, classes, model_body, input_image_shape = load_yolo(model_body, class_names, anchors)

# Captures and splits the video into images at different time frames
vc = cv2.VideoCapture(videopath)
while True:
    # Load the image at each time frame
    check, frame = vc.read()
    # Stop when no more frames can be read from the video.
    if not check:
        break

    # Preprocess the input image frame
    frame = process_data(np.expand_dims(frame, axis=0))

    # Predict and draw the class probabilities and bounding boxes for the given frame
    img_data = draw(model_body, scores, boxes, classes, input_image_shape, frame,
                    image_set='real', save_all=False, real_time=True)
    img_data = np.array(img_data)

    # Display the frame with the predicted class probabilities and bounding boxes.
    cv2.imshow('Capture:', img_data)
    key = cv2.waitKey(1)
    if key == ord('q'):
        break

vc.release()
cv2.destroyAllWindows()
This brings us to the end of this article. Congratulate yourself on reaching the end of this blog. As a reward, you now have a better understanding of how object detection works (using the YOLO algorithm) and how self-driving cars implement this technique to differentiate between cars, trucks, pedestrians, etc., to make better decisions. Finally, I encourage you to implement and play with the code yourself. You can find the full source code related to this article here.
Have anything to say? Feel free to comment below for any questions, suggestions, and discussions related to this article. Till then, keep hacking with HackerEarth.
Struggling to compose your own music? Check out this blog on how to Compose Jazz Music with Deep Learning.
The post Object detection for self-driving cars appeared first on HackerEarth Blog.
The post Introduction to Object Detection appeared first on HackerEarth Blog.
Humans can easily detect and identify objects present in an image. The human visual system is fast and accurate and can perform complex tasks like identifying multiple objects and detecting obstacles with little conscious thought. With the availability of large amounts of data, faster GPUs, and better algorithms, we can now easily train computers to detect and classify multiple objects within an image with high accuracy. In this blog, we will explore terms such as object detection, object localization, the loss function for object detection and localization, and finally explore an object detection algorithm known as “You only look once” (YOLO).
An image classification or image recognition model simply detects the probability of an object in an image. In contrast to this, object localization refers to identifying the location of an object in the image. An object localization algorithm will output the coordinates of the location of an object with respect to the image. In computer vision, the most popular way to localize an object in an image is to represent its location with the help of bounding boxes. Fig. 1 shows an example of a bounding box.
A bounding box is described by the coordinates of its center ($#b_x$#, $#b_y$#) together with its height ($#b_h$#) and width ($#b_w$#), measured relative to the image dimensions.
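For illustration, a small (hypothetical) helper converting these center-format parameters into the corner coordinates often used for drawing boxes and computing overlaps:

```python
def box_center_to_corners(b_x, b_y, b_h, b_w):
    """Convert a (center, height, width) bounding box into
    (x_min, y_min, x_max, y_max) corner coordinates."""
    x_min = b_x - b_w / 2
    x_max = b_x + b_w / 2
    y_min = b_y - b_h / 2
    y_max = b_y + b_h / 2
    return x_min, y_min, x_max, y_max
```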
The target variable for a multi-class image classification problem is defined as:
\begin{equation}
y =
\begin{bmatrix}
{c_1} & \\
{c_2} & \\
{c_3} & \\
{c_4}
\end{bmatrix}
\end{equation}
We can extend this approach to define the target variable for object localization. The target variable is defined as
\begin{equation}
y =
\begin{bmatrix}
{p_c} & \\
{b_x} & \\
{b_y} & \\
{b_h} & \\
{b_w} & \\
{c_1} & \\
{c_2} & \\
{c_3} & \\
{c_4}
\end{bmatrix}
\end{equation}
where,
$#\smash{p_c}$# = Probability/confidence of an object (i.e the four classes) being present in the bounding box.
$#\smash{b_x, b_y, b_h, b_w}$# = Bounding box coordinates.
$#\smash{c_i}$# = Probability of the $#\smash{i_{th}}$# class the object belongs to.
For example, let the four classes be ‘truck’, ‘car’, ‘bike’, and ‘pedestrian’, and let their probabilities be represented as $#c_1, c_2, c_3, c_4$#. So,
\begin{equation}
p_c =
\begin{cases}
1,\ \ c_i: \{c_1, c_2, c_3, c_4\} && \\
0,\ \ otherwise
\end{cases}
\end{equation}
Let the values of the target variable $#y$# be represented as $#y_1$#, $#y_2$#, $#\dots, y_9$#.
\begin{equation}
y ={
\begin{bmatrix}
{p_c}& {b_x} & {b_y} & {b_h} & {b_w} & {c_1} & {c_2} & {c_3} & {c_4}
\end{bmatrix}}^T \\
\begin{matrix}
& {y_1}& {y_2} & {y_3} & {y_4} & {y_5} & {y_6} & {y_7} & {y_8} & {y_9}
\end{matrix}
\end{equation}
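To make the encoding concrete, here is a small sketch (the helper name and box values are invented for the example) that builds the 9-element target vector for a training image containing a car:

```python
import numpy as np

classes = ['truck', 'car', 'bike', 'pedestrian']

def encode_target(obj_present, box=None, class_name=None):
    """Build the 9-element target vector [p_c, b_x, b_y, b_h, b_w, c1..c4].
    When no object is present, only p_c = 0 matters; the remaining
    components are "don't care" values and are left at zero here."""
    y = np.zeros(9)
    if obj_present:
        y[0] = 1.0                               # p_c
        y[1:5] = box                             # b_x, b_y, b_h, b_w
        y[5 + classes.index(class_name)] = 1.0   # one-hot class probability
    return y

y = encode_target(True, box=[0.5, 0.6, 0.3, 0.4], class_name='car')
```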
The loss function for object localization will be defined as
\begin{equation}
\mathcal{L(\hat{y}, y)} =
\begin{cases}
(\hat{y_1} - y_1)^2 + (\hat{y_2} - y_2)^2 + \dots + (\hat{y_9} - y_9)^2 &&, y_1=1 \\
(\hat{y_1} - y_1)^2 &&, y_1=0
\end{cases}
\end{equation}
In practice, we can use a log-likelihood loss on the softmax output for the predicted classes ($#c_1, c_2, c_3, c_4$#). For the bounding box coordinates, we can use something like a squared error, and for $#p_c$# (the confidence of an object) we can use a logistic regression loss.
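The piecewise loss above can be sketched directly as a simple squared-error version (ignoring the log/softmax refinement just mentioned; the function name is illustrative):

```python
import numpy as np

def localization_loss(y_hat, y):
    """Squared-error loss for object localization.
    If an object is present (y[0] == 1), penalize all 9 components;
    otherwise only the confidence term y[0] matters."""
    if y[0] == 1:
        return float(np.sum((y_hat - y) ** 2))
    return float((y_hat[0] - y[0]) ** 2)
```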
Since we have defined both the target variable and the loss function, we can now use neural networks to both classify and localize objects.
An approach to building an object detection system is to first build a classifier that can classify closely cropped images of an object. Fig 2. shows an example of such a model, where a model is trained on a dataset of closely cropped images of a car and predicts the probability of an image being a car.
Now, we can use this model to detect cars using a sliding window mechanism. In a sliding window mechanism, we use a sliding window (similar to the one used in convolutional networks) and crop a part of the image in each slide. The size of the crop is the same as the size of the sliding window. Each cropped image is then passed to a ConvNet model (similar to the one shown in Fig 2.), which in turn predicts the probability of the cropped image being a car.
After running the sliding window through the whole image, we resize the sliding window and run it over the image again. We repeat this process multiple times. Since we crop a large number of windows and pass each of them through the ConvNet, this approach is both computationally expensive and time-consuming, making the whole process really slow. A convolutional implementation of the sliding window helps resolve this problem.
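The naive sliding-window pipeline described above can be sketched as follows (the classifier passed in is a placeholder standing in for the trained ConvNet):

```python
import numpy as np

def sliding_window_detect(image, window, stride, predict_car_probability, threshold=0.5):
    """Naive sliding-window detection: crop every window position,
    classify each crop, and keep the windows scored above threshold."""
    h, w = image.shape[:2]
    detections = []
    for top in range(0, h - window + 1, stride):
        for left in range(0, w - window + 1, stride):
            crop = image[top:top + window, left:left + window]
            p = predict_car_probability(crop)
            if p >= threshold:
                detections.append((top, left, window, p))
    return detections
```

Every crop triggers a separate classifier evaluation, which is exactly the redundancy the convolutional implementation removes.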
Before we discuss the implementation of the sliding window using convnets, let’s analyze how we can convert the fully connected layers of the network into convolutional layers. Fig. 4 shows a simple convolutional network with two fully connected layers, each of shape (400, ).
A fully connected layer can be converted to a convolutional layer with the help of a 1D convolutional layer. The width and height of this layer are equal to one, and the number of filters is equal to the shape of the fully connected layer. An example of this is shown in Fig 5.
We can apply this concept of converting a fully connected layer into a convolutional layer to the model by replacing each fully connected layer with a 1D convolutional layer. The number of filters of the 1D convolutional layer is equal to the shape of the fully connected layer. The output softmax layer is also a convolutional layer, of shape (1, 1, 4), where 4 is the number of classes to predict. This representation is shown in Fig 6.
Now, let’s extend the above approach to implement a convolutional version of sliding window. First, let’s consider the ConvNet that we have trained to be in the following representation (no fully connected layers).
Let’s assume the size of the input image to be 16 × 16 × 3. If we were to use the sliding window approach, we would pass this image to the above ConvNet four times, where each time the sliding window crops a part of the input image of size 14 × 14 × 3 and passes it through the ConvNet. But instead of this, we feed the full image (with shape 16 × 16 × 3) directly into the trained ConvNet (see Fig. 7). This results in an output matrix of shape 2 × 2 × 4. Each cell in the output matrix represents the result of a possible crop and the classified value of the cropped image. For example, the left cell of the output (the green one) in Fig. 7 represents the result of the first sliding window. The other cells represent the results of the remaining sliding window operations.
Note that the stride of the sliding window is decided by the pooling size of the Max Pool layer. In the example above, the Max Pool layer uses a 2 × 2 pooling window, and as a result, the sliding window moves with a stride of two, resulting in four possible outputs. The main advantage of using this technique is that all the sliding window positions are computed simultaneously in a single forward pass. Consequently, this technique is really fast. A weakness of this technique, however, is that the positions of the bounding boxes are not very accurate.
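The stride/output relationship can be checked with a one-line computation (for the example above: a 16 × 16 input, a 14 × 14 window, and a stride of two):

```python
def num_window_positions(input_size, window_size, stride):
    """Number of sliding-window positions along one dimension."""
    return (input_size - window_size) // stride + 1

n = num_window_positions(16, 14, 2)  # 2 positions per dimension -> a 2 x 2 output grid
```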
A better algorithm that tackles the issue of predicting accurate bounding boxes while using the convolutional sliding window technique is the YOLO algorithm. YOLO stands for you only look once and was developed in 2015 by Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. It’s popular because it achieves high accuracy while running in real time. This algorithm is called so because it requires only one forward propagation pass through the network to make the predictions.
The algorithm divides the image into grids and runs the image classification and localization algorithm (discussed under object localization) on each of the grid cells. For example, suppose we have an input image of size 256 × 256. We place a 3 × 3 grid on the image (see Fig. 8).
Next, we apply the image classification and localization algorithm on each grid cell. For each grid cell, the target variable is defined as
\begin{equation}
y_{i, j} ={
\begin{bmatrix}
{p_c}& {b_x} & {b_y} & {b_h} & {b_w} & {c_1} & {c_2} & {c_3} & {c_4}
\end{bmatrix}}^T
\end{equation}
All of this is done in a single pass with the convolutional sliding window. Since the shape of the target variable for each grid cell is 1 × 9 and there are 9 (3 × 3) grid cells, the final output of the model will be of shape 3 × 3 × 9.
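The midpoint-to-grid-cell assignment can be sketched as follows (the helper name is illustrative; the defaults follow the 256 × 256 image with a 3 × 3 grid from the example):

```python
def grid_cell(x, y, image_size=256, grid=3):
    """Return the (row, col) of the grid cell containing the point (x, y).
    Each object is assigned to the cell that contains its midpoint."""
    cell = image_size / grid
    return int(y // cell), int(x // cell)
```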
The advantage of the YOLO algorithm is that it is very fast and predicts much more accurate bounding boxes. Also, in practice, to get more accurate predictions, we use a much finer grid, say 19 × 19, in which case the target output is of shape 19 × 19 × 9.
With this, we come to the end of the introduction to object detection. We now have a better understanding of how we can localize objects while classifying them in an image. We also learned to combine the concept of classification and localization with the convolutional implementation of the sliding window to build an object detection system. In the next blog, we will go deeper into the YOLO algorithm, loss function used, and implement some ideas that make the YOLO algorithm better. Also, we will learn to implement the YOLO algorithm in real time.
Have anything to say? Feel free to comment below for any questions, suggestions, and discussions related to this article. Till then, keep hacking with HackerEarth.
The post Data Visualization for Beginners-Part 3 appeared first on HackerEarth Blog.
Bonjour! Welcome to another part of the series on data visualization techniques. In the previous two articles, we discussed different data visualization techniques that can be applied to visualize and gather insights from categorical and continuous variables. You can check out the first two articles here:
In this article, we’ll go through the implementation and use of a bunch of data visualization techniques such as heat maps, surface plots, correlation plots, etc. We will also look at different techniques that can be used to visualize unstructured data such as images, text, etc.
### Importing the required libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.plotly as py
import plotly.graph_objs as go
%matplotlib inline
A heat map (or heatmap) is a two-dimensional graphical representation of data which uses colour to represent data points on the graph. It is useful in understanding underlying relationships between data values that would be much harder to understand if presented numerically in a table/matrix.
### We can create a heatmap by simply using the seaborn library.
sample_data = np.random.rand(8, 12)
ax = sns.heatmap(sample_data)
Let’s understand this using an example. We’ll be using the metadata from the Deep Learning 3 challenge. Link to the dataset. Deep Learning 3 challenged the participants to predict the attributes of animals by looking at their images.
### Training metadata contains the name of the image and the corresponding attributes associated with the animal in the image.
train = pd.read_csv('meta-data/train.csv')
train.head()
We will be analyzing how often an attribute occurs in relationship with the other attributes. To analyze this relationship, we will compute the co-occurrence matrix.
### Extracting the attributes
cols = list(train.columns)
cols.remove('Image_name')
attributes = np.array(train[cols])
print('There are {} attributes associated with {} images.'.format(attributes.shape[1],attributes.shape[0]))
Out: There are 85 attributes associated with 12600 images.
# Compute the co-occurrence matrix
cooccurrence_matrix = np.dot(attributes.transpose(), attributes)
print('\n Co-occurrence matrix: \n', cooccurrence_matrix)
Out: Co-occurrence matrix:
[[5091 728 797 ... 3797 728 2024]
[ 728 1614 0 ... 669 1614 1003]
[ 797 0 1188 ... 1188 0 359]
...
[3797 669 1188 ... 8305 743 3629]
[ 728 1614 0 ... 743 1933 1322]
[2024 1003 359 ... 3629 1322 6227]]
# Normalize the co-occurrence matrix by converting the values into percentages
#Reference:https://stackoverflow.com/questions/20574257/constructing-a-co-occurrence-matrix-in-python-pandas/20574460
cooccurrence_matrix_diagonal = np.diagonal(cooccurrence_matrix)
with np.errstate(divide='ignore', invalid='ignore'):
    cooccurrence_matrix_percentage = np.nan_to_num(np.true_divide(cooccurrence_matrix, cooccurrence_matrix_diagonal))
print('\n Co-occurrence matrix percentage: \n', cooccurrence_matrix_percentage)
We can see that the values in the co-occurrence matrix represent the occurrence of each attribute with the other attributes. Although the matrix contains all the information, it is visually hard to interpret and infer from the matrix. To counter this problem, we will use heat maps, which can help relate the co-occurrences graphically.
fig = plt.figure(figsize=(10, 10))
sns.set(style='white')
# Draw the heatmap with the mask and correct aspect ratio
ax = sns.heatmap(cooccurrence_matrix_percentage, cmap='viridis', center=0, square=True, linewidths=0.15, cbar_kws={"shrink": 0.5, "label": "Co-occurrence frequency"}, )
ax.set_title('Heatmap of the attributes')
ax.set_xlabel('Attributes')
ax.set_ylabel('Attributes')
plt.show()
Since the frequency of the co-occurrence is represented by a colour pallet, we can now easily interpret which attributes appear together the most. Thus, we can infer that these attributes are common to most of the animals.
Choropleths are a type of map that provides an easy way to show how some quantity varies across a geographical area or show the level of variability within a region. A heat map is similar but doesn’t include geographical boundaries. Choropleth maps are also appropriate for indicating differences in the distribution of the data over an area, like ownership or use of land or type of forest cover, density information, etc. We will be using the geopandas library to implement the choropleth graph.
We will be using a choropleth graph to visualize the GDP across the globe. Link to the dataset.
# Importing the required libraries
import geopandas as gpd
from shapely.geometry import Point
from matplotlib import cm
# GDP mapped to the corresponding country and their acronyms
df =pd.read_csv('GDP.csv')
df.head()
|   | COUNTRY        | GDP (BILLIONS) | CODE |
|---|----------------|----------------|------|
| 0 | Afghanistan    | 21.71          | AFG  |
| 1 | Albania        | 13.40          | ALB  |
| 2 | Algeria        | 227.80         | DZA  |
| 3 | American Samoa | 0.75           | ASM  |
| 4 | Andorra        | 4.80           | AND  |
### Importing the geometry locations of each country on the world map
geo = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))[['iso_a3', 'geometry']]
geo.columns = ['CODE', 'Geometry']
geo.head()
# Mapping the country codes to the geometry locations
df = pd.merge(df, geo, left_on='CODE', right_on='CODE', how='inner')
#converting the dataframe to geo-dataframe
geometry = df['Geometry']
df.drop(['Geometry'], axis=1, inplace=True)
crs = {'init':'epsg:4326'}
geo_gdp = gpd.GeoDataFrame(df, crs=crs, geometry=geometry)
## Plotting the choropleth
cpleth = geo_gdp.plot(column='GDP (BILLIONS)', cmap=cm.Spectral_r, legend=True, figsize=(8,8))
cpleth.set_title('Choropleth Graph - GDP of different countries')
Surface plots are used for the three-dimensional representation of the data. Rather than showing individual data points, surface plots show a functional relationship between a dependent variable (Z) and two independent variables (X and Y).
It is useful in analyzing relationships between the dependent and the independent variables and thus helps in establishing desirable responses and operating conditions.
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.ticker import LinearLocator, FormatStrFormatter
# Creating a figure
# projection = '3d' enables the third dimension during plot
fig = plt.figure(figsize=(10,8))
ax = fig.gca(projection='3d')
# Initialize data
X = np.arange(-5,5,0.25)
Y = np.arange(-5,5,0.25)
# Creating a meshgrid
X, Y = np.meshgrid(X, Y)
R = np.sqrt(np.abs(X**2 - Y**2))
Z = np.exp(R)
# plot the surface
surf = ax.plot_surface(X, Y, Z, cmap=cm.GnBu, antialiased=False)
# Customize the z axis.
ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))
ax.set_title('Surface Plot')
# Add a color bar which maps values to colors.
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.show()
One of the main applications of surface plots in machine learning or data science is the analysis of the loss function. From a surface plot, we can analyze how the hyperparameters affect the loss function and thus help prevent overfitting of the model.
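As an illustration (a sketch with made-up data, and over a simple model's two weights rather than hyperparameters), we can plot the mean-squared-error surface of a linear model y ≈ wx + b over a grid of candidate (w, b) pairs:

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy data: y = 2x + 1 with a little noise (illustrative values only)
rng = np.random.RandomState(0)
x = np.linspace(-1, 1, 50)
y = 2 * x + 1 + 0.1 * rng.randn(50)

# Grid of candidate parameters (slope w, intercept b)
W, B = np.meshgrid(np.linspace(0, 4, 100), np.linspace(-1, 3, 100))

# MSE loss at every (w, b) pair, computed by broadcasting over the grid
loss = ((W[..., None] * x + B[..., None] - y) ** 2).mean(axis=-1)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(W, B, loss, cmap=plt.cm.viridis)
ax.set_xlabel('slope w')
ax.set_ylabel('intercept b')
ax.set_zlabel('MSE')
ax.set_title('Loss surface of a linear model')
plt.show()
```

The bowl-shaped surface bottoms out near the true parameters (w = 2, b = 1), which is exactly the point gradient descent would converge to.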
Dimensionality refers to the number of attributes present in the dataset. For example, consumer-retail datasets can have a vast number of variables (e.g., sales, promos, products, open, etc.). As a result, visually exploring the dataset to find potential correlations between variables becomes extremely challenging.
Therefore, we use a technique called dimensionality reduction to visualize higher-dimensional datasets. Here, we will focus on two such techniques: Principal Component Analysis (PCA) and t-distributed Stochastic Neighbour Embedding (t-SNE).
Before we jump into understanding PCA, let’s review some terms:
A positive covariance means X and Y are positively related, i.e., as X increases, Y increases, while a negative covariance means the opposite relation. A covariance of zero means X and Y are uncorrelated, i.e., not linearly related.
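As a quick numeric check (with made-up arrays, purely illustrative), numpy's cov function confirms the sign behaviour described above:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y_pos = 2 * x                              # increases with x -> positive covariance
y_neg = -2 * x                             # decreases with x -> negative covariance
y_const = np.array([3., 3., 3., 3., 3.])   # constant -> zero covariance

# np.cov returns the 2x2 covariance matrix; [0, 1] is cov(x, y)
print(np.cov(x, y_pos)[0, 1])    # 5.0
print(np.cov(x, y_neg)[0, 1])    # -5.0
print(np.cov(x, y_const)[0, 1])  # 0.0
```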
PCA is the orthogonal projection of data onto a lower-dimensional linear space that maximizes the variance (green line) of the projected data and minimizes the mean squared distance between the data points and their projections (blue line). The variance describes the direction of maximum information, while the mean squared distance describes the information lost when projecting the data onto the lower dimension.
Thus, given a set of data points in a d-dimensional space, PCA projects these points onto a lower dimensional space while preserving as much information as possible.
In the figure, the component along the direction of maximum variance is defined as the first principal component. Similarly, the component along the direction of the second-highest variance is defined as the second principal component, and so on. These principal components are referred to as the new dimensions carrying the maximum information.
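Before reaching for scikit-learn, it helps to see the mechanics. Below is a minimal NumPy sketch of PCA via an eigendecomposition of the covariance matrix, run on toy 2-D data (scikit-learn's PCA, used next, performs the equivalent computation via SVD):

```python
import numpy as np

rng = np.random.RandomState(0)
# Correlated 2-D toy data (illustrative, not the article's dataset)
X = rng.multivariate_normal([0, 0], [[3, 2], [2, 2]], size=500)

# 1. Centre the data
Xc = X - X.mean(axis=0)
# 2. Covariance matrix of the features
C = np.cov(Xc, rowvar=False)
# 3. Eigendecomposition: eigenvectors are the principal axes,
#    eigenvalues the variance captured along each axis
eigvals, eigvecs = np.linalg.eigh(C)
order = eigvals.argsort()[::-1]           # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
# 4. Project the data onto the first principal component
pc1 = Xc @ eigvecs[:, 0]

print('variance along PC1:', eigvals[0])
print('variance along PC2:', eigvals[1])
```

The variance of the projected data `pc1` equals the first eigenvalue, which is the "maximum variance" property described above.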
# We will use the breast cancer dataset as an example
# The dataset is a binary classification dataset
# Importing the dataset
from sklearn.datasets import load_breast_cancer
data = load_breast_cancer()
X = pd.DataFrame(data=data.data, columns=data.feature_names) # Features
y = data.target # Target variable
# Importing PCA function
from sklearn.decomposition import PCA
pca = PCA(n_components=2) # n_components = number of principal components to generate
# Generating pca components from the data
pca_result = pca.fit_transform(X)
print("Explained variance ratio : \n",pca.explained_variance_ratio_)
Out: Explained variance ratio :
[0.98204467 0.01617649]
We can see that approximately 98% of the variance in the data lies along the first principal component, while the second component explains only about 1.6% of the variance.
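As an aside, instead of fixing n_components to an integer, scikit-learn's PCA also accepts a float between 0 and 1, in which case it keeps just enough components to explain that fraction of the total variance. A short sketch on the same breast cancer data:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA

X = load_breast_cancer().data

# A float in (0, 1) asks PCA to keep the smallest number of components
# whose cumulative explained variance reaches that threshold.
pca = PCA(n_components=0.99)
pca.fit(X)

# Given the ratios printed above (~0.982 and ~0.016), two components suffice here
print('components kept:', pca.n_components_)
print('variance explained:', pca.explained_variance_ratio_.sum())
```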
# Creating a figure
fig = plt.figure(1, figsize=(10, 10))
# Enabling 3-dimensional projection
ax = fig.add_subplot(projection='3d')
for i, name in enumerate(data.target_names):
    ax.text3D(np.std(pca_result[:, 0][y == i]) - i * 500,
              np.std(pca_result[:, 1][y == i]), 0,
              s=name, horizontalalignment='center',
              bbox=dict(alpha=.5, edgecolor='w', facecolor='w'))
# Plotting the PCA components
ax.scatter(pca_result[:,0], pca_result[:, 1], c=y, cmap = plt.cm.Spectral,s=20, label=data.target_names)
plt.show()
Thus, with the help of PCA, we can get a visual sense of how the labels are distributed across the given data (see Figure).
T-distributed Stochastic Neighbour Embedding (t-SNE) is a non-linear dimensionality reduction technique that is well suited for the visualization of high-dimensional data. It was developed by Laurens van der Maaten and Geoffrey Hinton. In contrast to PCA, which is a mathematical technique, t-SNE adopts a probabilistic approach.
PCA can capture the global structure of high-dimensional data but fails to describe the local structure within it. t-SNE, by contrast, captures the local structure of high-dimensional data very well while also revealing global structure such as the presence of clusters at several scales. t-SNE converts the similarities between data points into joint probabilities and tries to minimize the Kullback-Leibler divergence between the joint probabilities of the low-dimensional embedding and the high-dimensional data. In doing so, it preserves the original structure of the data.
# We will be using the scikit learn library to implement t-SNE
# Importing the t-SNE library
from sklearn.manifold import TSNE
# We will be using the iris dataset for this example
from sklearn.datasets import load_iris
# Loading the iris dataset
data = load_iris()
# Extracting the features
X = data.data
# Extracting the labels
y = data.target
# There are four features in the iris dataset with three different labels.
print('Features in iris data:\n', data.feature_names)
print('Labels in iris data:\n', data.target_names)
Out: Features in iris data:
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Labels in iris data:
['setosa' 'versicolor' 'virginica']
# Loading the TSNE model
# n_components = number of resultant components
# n_iter = Maximum number of iterations for the optimization.
tsne_model = TSNE(n_components=3, n_iter=2500, random_state=47)
# Generating new components
new_values = tsne_model.fit_transform(X)
labels = data.target_names
# Plotting the new dimensions/ components
fig = plt.figure(figsize=(5, 5))
ax = fig.add_subplot(projection='3d')
ax.view_init(elev=48, azim=134)
for label, name in enumerate(labels):
    ax.text3D(new_values[y == label, 0].mean(),
              new_values[y == label, 1].mean() + 1.5,
              new_values[y == label, 2].mean(), name,
              horizontalalignment='center',
              bbox=dict(alpha=.5, edgecolor='w', facecolor='w'))
ax.scatter(new_values[:,0], new_values[:,1], new_values[:,2], c=y)
ax.set_title('High-Dimension data visualization using t-SNE', loc='right')
plt.show()
Thus, by reducing the dimensions using t-SNE, we can visualize the distribution of the labels over the feature space. We can see in the figure that the labels are clustered in their own little groups. So, if we were to use a clustering algorithm to generate clusters from the new features/components, we could accurately assign new points to a label.
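To make that last point concrete, here is a hedged sketch (not part of the original walkthrough) that runs KMeans on a 2-component t-SNE embedding of the iris data and measures the agreement with the true labels via the adjusted Rand index:

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

X, y = load_iris(return_X_y=True)

# Reduce to 2 components (2 is enough for clustering; the article used 3 for plotting)
emb = TSNE(n_components=2, random_state=47).fit_transform(X)

# Cluster the embeddings into as many clusters as there are species
clusters = KMeans(n_clusters=3, n_init=10, random_state=47).fit_predict(emb)

# 1.0 would mean the clusters match the species labels perfectly
print('adjusted Rand index:', adjusted_rand_score(y, clusters))
```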
Let’s quickly summarize the topics we covered. We started with the generation of heatmaps using random numbers and extended their application to a real-world example. Next, we implemented choropleth graphs to visualize data points with respect to geographical locations. We then implemented surface plots to get an idea of how to visualize data on a three-dimensional surface. Finally, we used two dimensionality reduction techniques, PCA and t-SNE, to visualize high-dimensional datasets.
I encourage you to implement the examples described in this article to get a hands-on experience. Hope you enjoyed the article. Do let me know if you have any feedback, suggestions, or thoughts on this article in the comments below!
The post Data Visualization for Beginners-Part 3 appeared first on HackerEarth Blog.
The post Composing Jazz Music with Deep Learning appeared first on HackerEarth Blog.
Deep Learning is on the rise, extending its application in every field, ranging from computer vision to natural language processing, healthcare, speech recognition, generating art, addition of sound to silent movies, machine translation, advertising, self-driving cars, etc. In this blog, we will extend the power of deep learning to the domain of music production. We will talk about how we can use deep learning to generate new musical beats.
Current technological advancements have transformed the way we produce, listen to, and work with music. With the advent of deep learning, it has become possible to generate music without instruments that artists may previously not have had access to or the skills to use. This offers artists more creative freedom and the ability to explore different domains of music.
Since music is a sequence of notes and chords, it doesn’t have a fixed dimensionality. Traditional deep neural network techniques cannot be applied to generate music as they assume the inputs and targets/outputs to have fixed dimensionality and outputs to be independent of each other. It is therefore clear that a domain-independent method that learns to map sequences to sequences would be useful.
Recurrent neural networks (RNNs) are a class of artificial neural networks that make use of sequential information present in the data.
A recurrent neural network has looped, or recurrent, connections which allow the network to hold information across inputs. These connections can be thought of as memory cells. In other words, RNNs can make use of information learned in the previous time step. As seen in Fig. 1, the output of the previous hidden/activation layer is fed into the next hidden layer. Such an architecture is efficient in learning sequence-based data.
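The recurrence described above can be sketched in a few lines of NumPy. This toy cell (random weights, illustrative dimensions) shows how the hidden state h carries information from one time step to the next:

```python
import numpy as np

rng = np.random.RandomState(0)
input_dim, hidden_dim = 4, 8

# Parameters of a single vanilla RNN cell (toy sizes, randomly initialised)
Wxh = rng.randn(hidden_dim, input_dim) * 0.1   # input -> hidden
Whh = rng.randn(hidden_dim, hidden_dim) * 0.1  # hidden -> hidden (the recurrent loop)
b = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    """One time step: the new hidden state mixes the current input
    with the previous hidden state (the network's 'memory')."""
    return np.tanh(Wxh @ x_t + Whh @ h_prev + b)

# Run a sequence of 5 random input vectors through the cell
h = np.zeros(hidden_dim)
for x_t in rng.randn(5, input_dim):
    h = rnn_step(x_t, h)

print(h.shape)  # (8,)
```

LSTM cells, used below, extend this step with gates that control what the hidden state keeps and forgets, which is what lets them remember information over many timesteps.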
In this blog, we will be using the Long Short-Term Memory (LSTM) architecture. LSTM is a type of recurrent neural network (proposed by Hochreiter and Schmidhuber, 1997) that can remember a piece of information and keep it saved for many timesteps.
Our dataset includes piano tunes stored in the MIDI format. MIDI (Musical Instrument Digital Interface) is a protocol which allows electronic instruments and other digital musical tools to communicate with each other. Since a MIDI file only stores performance information, i.e., a series of messages like 'note on' and 'note off', it is compact, easy to modify, and can be adapted to any instrument.
Before we move forward, let us understand some music-related terminology:
- Note: a sound with a given pitch and duration (e.g., 'E4').
- Chord: a combination of two or more notes played simultaneously.
- Pitch: the frequency of a sound, i.e., how high or low a note is.
We will use the music21 toolkit (a toolkit for computer-aided musicology, MIT) to extract data from these MIDI files.
import glob
import pickle
from music21 import converter, instrument, note, chord

# 'songs' holds the paths of the MIDI files in the dataset
# (the path below is a placeholder; point it at your own dataset)
songs = glob.glob('midi_songs/*.mid')

def get_notes():
    notes = []
    for file in songs:
        # converting .mid file to stream object
        midi = converter.parse(file)
        parts = None
        try:
            # Given a single stream, partition into a part for each unique instrument
            parts = instrument.partitionByInstrument(midi)
        except Exception:
            pass
        if parts:  # if parts has instrument parts
            notes_to_parse = parts.parts[0].recurse()
        else:
            notes_to_parse = midi.flat.notes
        for element in notes_to_parse:
            if isinstance(element, note.Note):
                # if element is a note, extract its pitch
                notes.append(str(element.pitch))
            elif isinstance(element, chord.Chord):
                # if element is a chord, append the normal form of the
                # chord (a list of integers) to the list of notes
                notes.append('.'.join(str(n) for n in element.normalOrder))
    with open('data/notes', 'wb') as filepath:
        pickle.dump(notes, filepath)
    return notes
The function get_notes returns a list of the notes and chords present in the .mid files. We use the converter.parse function to convert each MIDI file into a stream object, which in turn is used to extract the notes and chords present in the file. The list returned by get_notes() looks as follows:
Out:
['F2', '4.5.7', '9.0', 'C3', '5.7.9', '7.0', 'E4', '4.5.8', '4.8', '4.8', '4', 'G#3',
'D4', 'G#3', 'C4', '4', 'B3', 'A2', 'E3', 'A3', '0.4', 'D4', '7.11', 'E3', '0.4.7', 'B4', 'C3', 'G3', 'C4', '4.7', '11.2', 'C3', 'C4', '11.2.4', 'G4', 'F2', 'C3', '0.5', '9.0', '4.7', 'F2', '4.5.7.9.0', '4.8', 'F4', '4', '4.8', '2.4', 'G#3',
'8.0', 'E2', 'E3', 'B3', 'A2', '4.9', '0.4', '7.11', 'A2', '9.0.4', ...........]
We can see that the list consists of pitches and chords (each chord represented as a list of integers separated by dots). We treat each distinct chord as if it were a new pitch in the list. Just as letters are used to form words in a sentence, the music vocabulary used to generate music is defined by the unique pitches in the notes list.
A neural network accepts only real values as input and since the pitches in the notes list are in string format, we need to map each pitch in the notes list to an integer. We can do so as follows:
# Extract the unique pitches in the list of notes.
pitchnames = sorted(set(item for item in notes))
# create a dictionary to map pitches to integers
note_to_int = dict((note, number) for number, note in enumerate(pitchnames))
Next, we will create an array of input and output sequences to train our model. Each input sequence will consist of 100 notes, while the output array stores the 101st note for the corresponding input sequence. So, the objective of the model will be to predict the 101st note of the input sequence of notes.
# create input sequences and the corresponding outputs
sequence_length = 100
network_input = []
network_output = []
for i in range(0, len(notes) - sequence_length, 1):
    sequence_in = notes[i: i + sequence_length]
    sequence_out = notes[i + sequence_length]
    network_input.append([note_to_int[char] for char in sequence_in])
    network_output.append(note_to_int[sequence_out])
Next, we reshape and normalize the input vector sequence before feeding it to the model. Finally, we one-hot encode our output vector.
import numpy as np
from keras.utils import np_utils

n_patterns = len(network_input)
# reshape the input into a format compatible with LSTM layers
network_input = np.reshape(network_input, (n_patterns, sequence_length, 1))
# normalize input
network_input = network_input / float(n_vocab)
# One hot encode the output vector
network_output = np_utils.to_categorical(network_output)
We will use Keras to build our model architecture. We use a character-level architecture to train the model: each input note in the music file is used to predict the next note in the file, i.e., each LSTM cell takes the previous layer's activation (a^{⟨t−1⟩}) and the previous time step's actual output (y^{⟨t−1⟩}) as input at the current time step t. This is depicted in the following figure (Fig 2.).
Our model architecture is defined as:
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Flatten, Dense, Activation

model = Sequential()
model.add(LSTM(128, input_shape=network_input.shape[1:], return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128, return_sequences=True))
model.add(Flatten())
model.add(Dense(256))
model.add(Dropout(0.3))
model.add(Dense(n_vocab))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
Our music model consists of two LSTM layers with 128 hidden units each. We use categorical cross-entropy as the loss function and Adam as the optimizer. Fig. 3 shows the model summary.
To train the model, we call the model.fit function with the input and output sequences as the input to the function. We also create a model checkpoint which saves the best model weights.
from keras.callbacks import ModelCheckpoint
def train(model, network_input, network_output, epochs):
    """
    Train the neural network
    """
    filepath = 'weights.best.music3.hdf5'
    checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=0, save_best_only=True)
    model.fit(network_input, network_output, epochs=epochs, batch_size=32, callbacks=[checkpoint])
def train_network():
    epochs = 200
    notes = get_notes()
    print('Notes processed')
    n_vocab = len(set(notes))
    print('Vocab generated')
    network_in, network_out = prepare_sequences(notes, n_vocab)
    print('Input and Output processed')
    model = create_network(network_in, n_vocab)
    print('Model created')
    print('Training in progress')
    train(model, network_in, network_out, epochs)
    print('Training completed')
    return model
The train_network method gets the notes, creates the input and output sequences, creates a model, and trains the model for 200 epochs.
Now that we have trained our model, we can use it to generate some new notes. To generate new notes, we need a starting note. So, we randomly pick an integer and pick a random sequence from the input sequence as a starting point.
def generate_notes(model, network_input, pitchnames, n_vocab):
    """ Generate notes from the neural network based on a sequence of notes """
    # Pick a random integer
    start = np.random.randint(0, len(network_input) - 1)
    int_to_note = dict((number, note) for number, note in enumerate(pitchnames))
    # pick a random sequence from the input as a starting point for the prediction
    pattern = network_input[start]
    prediction_output = []
    print('Generating notes........')
    # generate 500 notes
    for note_index in range(500):
        prediction_input = np.reshape(pattern, (1, len(pattern), 1))
        prediction_input = prediction_input / float(n_vocab)
        prediction = model.predict(prediction_input, verbose=0)
        # Predicted output is the argmax(P(h|D))
        index = np.argmax(prediction)
        # Mapping the predicted integer back to the corresponding note
        result = int_to_note[index]
        # Storing the predicted output
        prediction_output.append(result)
        pattern.append(index)
        # Next input to the model: drop the first note to slide the window forward
        pattern = pattern[1:len(pattern)]
    print('Notes Generated...')
    return prediction_output
Next, we use the trained model to predict the next 500 notes. At each time step, the predicted output of the previous time step (ŷ^{⟨t−1⟩}) is provided as input (x^{⟨t⟩}) to the LSTM layer at the current time step t. This is depicted in the following figure (see Fig. 4).
Since the predicted output is an array of probabilities, we choose the output at the index with the maximum probability. Finally, we map this index to the actual note and add this to the list of predicted output. Since the predicted output is a list of strings of notes and chords, we cannot play it. Hence, we encode the predicted output into the MIDI format using the create_midi method.
### Converts the predicted output to midi format
create_midi(prediction_output)
To create some new jazz music, you can simply call the generate() method, which calls all the related methods and saves the predicted output as a MIDI file.
#### Generate a new jazz music
generate()
Out:
Initiating music generation process.......
Loading Model weights.....
Model Loaded
Generating notes........
Notes Generated...
Saving Output file as midi....
To play the generated MIDI in the Jupyter Notebook, you can import the play_midi method from the play.py file, use an external MIDI player, or convert the MIDI file to mp3. Let's listen to our generated jazz piano music.
### Play the Jazz music
play.play_midi('test_output3.mid')
Generated Track 1
Congratulations! You can now generate your own jazz music. You can find the full code in this GitHub repository. I encourage you to play with the parameters of the model and to train it with input sequences of different lengths. Try implementing the code for another instrument (such as guitar). Furthermore, such a character-based model can also be applied to a text corpus to generate sample text, such as a poem.
Also, you can showcase your own personal composer and any similar idea in the World Music Hackathon by HackerEarth.
Have anything to say? Feel free to comment below for any questions, suggestions, and discussions related to this article. Till then, happy coding.
The post Data visualization for beginners – Part 2 appeared first on HackerEarth Blog.
Welcome to Part II of the series on data visualization. In the last blog post, we explored different ways to visualize continuous variables and infer information. If you haven’t visited that article, you can find it here. In this blog, we will expand our exploration to categorical variables and investigate ways in which we can visualize and gain insights from them, in isolation and in combination with other variables (both categorical and continuous).
Before we dive into the different graphs and plots, let’s define a categorical variable. In statistics, a categorical variable is one which has two or more categories with no intrinsic ordering between them, for example, gender, color, or city. If there is some kind of ordering between the categories, the variable is classified as an ordinal variable; for example, if you categorize car prices as cheap, moderate, and expensive, there is a clear ordering between the categories.
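In pandas, this distinction maps directly onto the Categorical type, whose ordered flag separates nominal from ordinal variables (a small illustrative sketch):

```python
import pandas as pd

# A nominal categorical variable: no ordering between categories
color = pd.Categorical(['red', 'blue', 'red', 'green'])
print(color.ordered)  # False

# An ordinal variable: the categories have a meaningful order,
# so comparisons like min/max become well defined
price = pd.Categorical(['cheap', 'expensive', 'moderate', 'cheap'],
                       categories=['cheap', 'moderate', 'expensive'],
                       ordered=True)
print(price.min(), '<', price.max())  # cheap < expensive
```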
# Importing the necessary libraries.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
We will be using the Adult data set, which is an extraction of the 1994 census dataset. The prediction task is to determine whether a person makes more than 50K a year. Here is the link to the dataset. In this blog, we will be using the dataset only for data analysis.
# Since the dataset doesn't contain the column header, we need to specify it manually.
cols = ['age', 'workclass', 'fnlwgt', 'education', 'education-num', 'marital-status', 'occupation', 'relationship', 'race', 'gender', 'capital-gain', 'capital-loss', 'hours-per-week', 'native-country', 'annual-income']
# Importing dataset
data = pd.read_csv('adult dataset/adult.data', names=cols)
# The first five columns of the dataset.
data.head()
A bar chart or graph is a graph with rectangular bars or bins that are used to plot categorical values. Each bar in the graph represents a categorical variable and the height of the bar is proportional to the value represented by it.
Bar graphs are used:
# Let's start by visualizing the distribution of gender in the dataset.
fig, ax = plt.subplots()
# Counting 'Males' and 'Females' in the dataset
counts = data.gender.value_counts()
# Plotting the bar graph; using the value_counts index keeps the
# labels aligned with their counts
ax.bar(counts.index, counts.values)
ax.set_xlabel('Gender')
ax.set_ylabel('Count')
plt.show()
From the figure, we can infer that there are more males than females in the dataset. Next, we will use a bar graph to visualize the distribution of annual income based on both gender and hours-per-week (i.e., the number of hours worked per week).
# For this plot, we will be using the seaborn library as it provides more flexibility with dataframes.
sns.barplot(x='gender', y='hours-per-week', hue='annual-income', data=data)
plt.show()
So, from the figure above, we can infer that both males and females with an annual income of more than 50K tend to work more hours per week.
This is a seaborn-specific function which is used to plot the count or frequency distribution of each unique observation in the categorical variable. It is similar to a histogram over a categorical rather than quantitative variable.
So, let’s plot the number of males and females in the dataset using the countplot function.
# Using Countplot to count number of males and females in the dataset.
sns.countplot(x='gender', data=data)
plt.show()
Earlier, we plotted the same thing using a bar graph, and it required some external calculations on our part to do so. But we can do the same thing using the countplot function in just a single line of code. Next, we will see how we can use countplot for deeper insights.
# ‘hue’ is used to visualize the effect of an additional variable to the current distribution.
sns.countplot(x='gender', hue='annual-income', data=data)
plt.show()
From the figure above, we can count the number of males and females whose annual income is <=50K and >50K.
So, we can infer that out of approximately 32,500 people, only about 8,000 have an income greater than 50K, and of those, only about 1,000 are females.
Box plots, also known as box and whisker plots, are widely used in data visualization to visualize variations and compare different categories in a given set of data. A box plot doesn’t display the distribution in detail but is useful in detecting whether the distribution is skewed and in detecting outliers in the data. In a box and whisker plot:
- The box spans the interquartile range (IQR), from the first quartile (Q1) to the third quartile (Q3).
- The line inside the box marks the median.
- The whiskers extend from the box to the smallest and largest values within 1.5 × IQR of the box edges.
- Points plotted beyond the whiskers are treated as outliers.
Let’s use a box and whisker plot to find a correlation between ‘hours-per-week’ and ‘relationship’ based on their annual income.
# Creating a box plot
fig, ax = plt.subplots(figsize=(15, 8))
sns.boxplot(x='relationship', y='hours-per-week', hue='annual-income', data=data, ax=ax)
ax.set_title('Annual Income of people based on relationship and hours-per-week')
plt.show()
We can interpret some interesting results from the box plot. Within the same relationship status, people with an annual income of more than 50K often work more hours per week. Similarly, we can also infer that people who have a child and earn less than 50K tend to have more flexible working hours.
Apart from this, we can also detect outliers in the data. For example, people with relationship status ‘Not in family’ (see Fig 6.) and an income less than 50K have a large number of outliers at both the high and low ends. This also seems logically correct, as a person who earns less than 50K annually may work more or fewer hours depending on the type of job and employment status.
A strip plot is a data visualization technique in which the sorted values of a variable are plotted along one axis. It is used to represent the distribution of a continuous variable with respect to the different levels of a categorical variable. For example, a strip plot can be used to show the distribution of the variable ‘gender’, i.e., males and females, with respect to the number of hours they work each week. A strip plot is also a good complement to a box plot or a violin plot when you want to show all the observations along with some representation of the underlying distribution.
# Using Strip plot to visualize the data.
fig, ax= plt.subplots(figsize=(10, 8))
sns.stripplot(x='annual-income', y='hours-per-week', data=data, jitter=True, ax=ax)
ax.set_title('Strip plot')
plt.show()
In the figure, by looking at the distribution of the data points, we can deduce that most of the people with an annual income greater than 50K work between 40 and 60 hours per week, while those with an income less than 50K can work anywhere between 0 and 60 hours per week.
Sometimes the mean and median may not be enough to understand the distribution of a variable in the dataset. The data may be clustered around the maximum or minimum with nothing in the middle. Box plots are a great way to summarize statistical information about the distribution of the data (through the interquartile range and median), but they cannot be used to visualize variations in the distribution.
A violin plot is a combination of a box plot and a kernel density estimate (KDE, described in Part I of this blog series), which can be used to visualize the probability distribution of the data. Violin plots can be interpreted as follows:
- The white dot in the middle marks the median.
- The thick bar in the centre represents the interquartile range.
- The thin line extending from it covers the rest of the distribution.
- The width of the violin at any point represents the density of the data at that value: the wider the section, the more observations there.
Let’s now build a violin plot. To start with, we will analyze the distribution of annual income of the people w.r.t. the number of hours they work per week.
fig, ax = plt.subplots(figsize=(10, 8))
sns.violinplot(x='annual-income', y='hours-per-week', data=data, ax=ax)
ax.set_title('Violin plot')
plt.show()
In Fig 9, the median number of working hours per week is roughly the same (approximately 40) for people earning less than 50K and for those earning more than 50K. Although people earning less than 50K show a wider range in the hours they work per week, most of the people who earn more than 50K work in the range of 40 to 80 hours per week.
Next, we can visualize the same distribution, but this time grouping the data by gender.
# Violin plot
fig, ax = plt.subplots(figsize=(10, 8))
sns.violinplot(x='annual-income', y='hours-per-week', hue='gender', data=data, ax=ax)
ax.set_title('Violin plot grouped according to gender')
plt.show()
Adding the variable ‘gender’ gives us insight into how much time each gender spends working per week based on annual income. From the figure, we can infer that males with an annual income of less than 50K tend to spend more hours working per week than females. But among people earning more than 50K, males and females spend a roughly equal number of hours per week working.
Violin plots, although more informative, are less frequently used in data visualization. It may be because they are hard to grasp and understand at first glance. But their ability to represent the variations in the data are making them popular among machine learning and data enthusiasts.
PairGrid is used to plot the pairwise relationship of all the variables in a dataset. This may seem to be similar to the pairplot we discussed in part I of this series. The difference is that instead of plotting all the plots automatically, as in the case of pairplot, Pair Grid creates a class instance, allowing us to map specific functions to the different sections of the grid.
Let’s start by defining the class.
# Creating an instance of the pair grid plot.
g = sns.PairGrid(data=data, hue='annual-income')
The variable ‘g’ here is a class instance. If we were to display ‘g’, we would get a grid of empty plots. There are four grid sections to fill in a PairGrid: the upper triangle, the lower triangle, the diagonal, and the off-diagonal. To fill all the sections with the same plot, we can simply call ‘g.map’ with the type of plot and its parameters.
# Creating a scatter plot for all pairs of variables.
g = sns.PairGrid(data=data, hue='annual-income')
g.map(plt.scatter)
The ‘g.map_lower’ method only fills the lower triangle of the grid while the ‘g.map_upper’ method only fills the upper triangle of the grid. Similarly, ‘g.map_diag’ and ‘g.map_offdiag’ fills the diagonal and off-diagonal of the grid, respectively.
#Here we plot scatter plot, histogram and violin plot using Pair grid.
g = sns.PairGrid(data=data, vars = ['age', 'education-num', 'hours-per-week'])
# with the help of the vars parameter we can select the variables between which we want the plot to be constructed.
g.map_lower(plt.scatter, color='red')
g.map_diag(plt.hist, bins=15)
g.map_upper(sns.violinplot)
Thus, with the help of PairGrid, we can visualize the relationships between three variables (‘hours-per-week’, ‘education-num’, and ‘age’) using three different plots, all in the same figure. PairGrid comes in handy when visualizing multiple plots in the same figure.
Let’s summarize what we learned. We started by visualizing the distribution of categorical variables in isolation. Then, we moved on to visualizing the relationship between a categorical and a continuous variable. Finally, we explored visualizing relationships when more than two variables are involved. Next week, we will explore how we can visualize unstructured data. In the meantime, I encourage you to download the census data (used in this blog) or any other dataset of your choice and play with all the variations of the plots covered here. Till then, Adiós!
The post Data visualization for beginners – Part 1 appeared first on HackerEarth Blog.
This is a series of blogs dedicated to different data visualization techniques used in various domains of machine learning. Data Visualization is a critical step for building a powerful and efficient machine learning model. It helps us to better understand the data, generate better insights for feature engineering, and, finally, make better decisions during modeling and training of the model.
For this blog, we will use the seaborn and matplotlib libraries to generate the visualizations. Matplotlib is a MATLAB-like plotting framework in Python, while seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for producing statistical graphics. In this blog, we will explore different statistical graphical techniques that can help us effectively interpret and understand the data. Although all the plots made with the seaborn library can also be built with the matplotlib library, we usually prefer seaborn because of its ability to handle DataFrames.
We will start by importing the two libraries. Here is the guide to installing the matplotlib library and seaborn library. (Note that I’ll be using matplotlib and seaborn libraries interchangeably depending on the plot.)
### Importing necessary library
import random
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
Let’s begin with a simple line plot, which is used to plot the relationship or dependence of one variable on another. Say we have two variables ‘x’ and ‘y’ with the following values:
x = np.array([ 0, 0.53, 1.05, 1.58, 2.11, 2.63, 3.16, 3.68, 4.21,
4.74, 5.26, 5.79, 6.32, 6.84])
y = np.array([ 0, 0.51, 0.87, 1. , 0.86, 0.49, -0.02, -0.51, -0.88,
-1. , -0.85, -0.47, 0.04, 0.53])
To plot the relationship between the two variables, we can simply call the plot function.
### Creating a figure to plot the graph.
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel('X data')
ax.set_ylabel('Y data')
ax.set_title('Relationship between variables X and Y')
plt.show() # display the graph
### Note: if %matplotlib inline has already been invoked, the plot is rendered inline automatically and the call to plt.show() is optional.
Here, we can see that the variables ‘x’ and ‘y’ have a sinusoidal relationship. In general, the .plot() function is used to visualize any mathematical relationship between two variables.
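As a quick numerical check (an illustrative sketch, not part of the original tutorial), we can confirm that the sample data above really does follow a sine curve:

```python
import numpy as np

x = np.array([0, 0.53, 1.05, 1.58, 2.11, 2.63, 3.16, 3.68, 4.21,
              4.74, 5.26, 5.79, 6.32, 6.84])
y = np.array([0, 0.51, 0.87, 1., 0.86, 0.49, -0.02, -0.51, -0.88,
              -1., -0.85, -0.47, 0.04, 0.53])

# Largest deviation between y and sin(x); it is tiny, so y is (rounded) sin(x)
max_dev = np.max(np.abs(y - np.sin(x)))
print(max_dev)
```

This is why the line plot traces one clean period of a sine wave.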
A histogram is one of the most frequently used data visualization techniques in machine learning. It represents the distribution of a continuous variable over a given interval or period of time. Histograms plot the data by dividing it into intervals called ‘bins’. It is used to inspect the underlying frequency distribution (eg. Normal distribution), outliers, skewness, etc.
Let’s assume some data ‘x’ and analyze its distribution and other related features.
### Let 'x' be the data with 1000 random points.
x = np.random.randn(1000)
Let’s plot a histogram to analyze the distribution of ‘x’.
plt.hist(x)
plt.xlabel('Intervals')
plt.ylabel('Frequency')
plt.title('Distribution of the variable x')
plt.show()
The above plot shows a normal distribution, i.e., the variable ‘x’ is normally distributed. We can also infer that this particular sample is somewhat negatively skewed. We usually tune the ‘bins’ parameter to produce a distribution with smooth boundaries. For example, if we set the number of ‘bins’ too low, say bins=5, most of the values accumulate in the same few intervals, producing a distribution that hides the underlying shape.
plt.hist(x, bins=5)
plt.xlabel('Intervals')
plt.ylabel('Frequency')
plt.title('Distribution of the variable x')
plt.show()
Similarly, if we increase the number of ‘bins’ to a high value, say bins=1000, almost every value gets its own bin, and as a result the distribution looks noisy and hard to read.
plt.hist(x, bins=1000)
plt.xlabel('Intervals')
plt.ylabel('Frequency')
plt.title('Distribution of the variable x')
plt.show()
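To see numerically what the ‘bins’ parameter does (an illustrative sketch, not from the original post), np.histogram returns the same per-bin counts that plt.hist draws:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)

# Counts per interval and the bin edges, for coarse and fine binning
counts5, edges5 = np.histogram(x, bins=5)
counts50, edges50 = np.histogram(x, bins=50)

# Every sample falls in exactly one bin, regardless of how many bins we use;
# with 5 bins the central intervals hold most of the mass.
print(counts5, counts5.sum(), counts50.sum())
```

Changing bins never changes the data, only how finely its range is partitioned.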
Before we dive into understanding KDE, let’s understand what parametric and non-parametric data are.
Parametric Data: When the data is assumed to have been drawn from a particular distribution and some parametric test can be applied to it
Non-Parametric Data: When we have no knowledge about the population and the underlying distribution
Kernel Density Estimation (KDE) is a non-parametric way of estimating the probability density function of a random variable. It is used when a parametric distribution doesn’t make much sense for the data and you want to avoid making assumptions about it.
The kernel density estimator is the estimated pdf of a random variable. For n samples x₁, …, xₙ it is defined as
f̂(x) = (1 / (n·h)) · Σᵢ K((x − xᵢ) / h), where K is the kernel function (e.g., Gaussian) and h is the bandwidth.
Similar to a histogram, a KDE plot shows the values of the variable along one axis and the estimated density along the other.
### We will use the seaborn library to plot KDE.
### Let's assume random data stored in variable 'x'.
fig, ax = plt.subplots()
### Generating random data
x = np.random.rand(200)
sns.kdeplot(x, shade=True, ax=ax)
plt.show()
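To make the estimator concrete, here is a minimal hand-rolled Gaussian KDE (an illustrative sketch; seaborn’s implementation differs, notably in how it selects the bandwidth automatically):

```python
import numpy as np

def gaussian_kde(data, grid, bandwidth):
    """Evaluate f_hat(x) = (1/(n*h)) * sum_i K((x - x_i)/h) with a Gaussian kernel K."""
    z = (grid[:, None] - data[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return kernel.sum(axis=1) / (len(data) * bandwidth)

rng = np.random.default_rng(0)
x = rng.standard_normal(500)
grid = np.linspace(-4, 4, 200)
density = gaussian_kde(x, grid, bandwidth=0.4)

# A valid density estimate should integrate to approximately 1
area = density.sum() * (grid[1] - grid[0])
print(area)
```

Smaller bandwidths behave like many narrow histogram bins (spiky estimates); larger ones over-smooth, mirroring the bins trade-off above.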
Distplot combines the function of the histogram and the KDE plot into one figure.
### Generating a random sample
x = np.random.random_sample(1000)
### Plotting the distplot
sns.distplot(x, bins=20)
So, the distplot function plots the histogram and the KDE for the sample data in the same figure. You can tune the parameters of distplot to display only the histogram, only the KDE, or both. Distplot comes in handy when you want to visualize how close your assumption about the distribution of the data is to the actual distribution.
Scatter plots are used to determine the relationship between two variables. They show how much one variable is affected by another. It is the most commonly used data visualization technique and helps in drawing useful insights when comparing two variables. The relationship between two variables is called correlation. If the data points fit a line or curve with a positive slope, then the two variables are said to show positive correlation. If the line or curve has a negative slope, then the variables are said to have a negative correlation.
A perfect positive correlation has a value of 1 and a perfect negative correlation has a value of -1. The closer the value is to 1 or -1, the stronger the relationship between the variables. The closer the value is to 0, the weaker the correlation.
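The correlation values described above can be computed directly with np.corrcoef (an illustrative sketch using synthetic data):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(500)
noise = rng.random(500)

y_pos = 2 * x + 0.1 * noise    # rises with x -> correlation near +1
y_neg = -2 * x + 0.1 * noise   # falls with x -> correlation near -1

r_pos = np.corrcoef(x, y_pos)[0, 1]
r_neg = np.corrcoef(x, y_neg)[0, 1]
print(r_pos, r_neg)
```

The small noise term keeps the points off a perfect line, so the coefficients are close to, but not exactly, ±1.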
For our example, let’s define three variables ‘x’, ‘y’, and ‘z’, where ‘x’ and ‘z’ are randomly generated data and ‘y’ is defined as y = x * (z + x).
We will use a scatter plot to find the relationship between the variables ‘x’ and ‘y’.
### Let's define the variables we want to find the relationship between.
x = np.random.rand(500)
z = np.random.rand(500)
### Defining the variable 'y'
y = x * (z + x)
fig, ax = plt.subplots()
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_title('Scatter plot between X and Y')
plt.scatter(x, y, marker='.')
plt.show()
From the figure above, we can see that the data points lie close to a single curve, and a curve fitted through them would have a positive slope. Therefore, we can infer that there is a strong positive correlation between the values of the variables ‘x’ and ‘y’.
Also, we can see that the curve that best fits the graph is quadratic in nature, which is confirmed by looking at the definition of the variable ‘y’.
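We can verify the quadratic shape numerically (an illustrative check, not part of the original tutorial) by fitting a degree-2 polynomial with np.polyfit:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(500)
z = rng.random(500)
y = x * (z + x)  # = x**2 + x*z, so roughly quadratic in x

# Fit y ~ a*x**2 + b*x + c; since y = x**2 + x*z and z averages 0.5,
# the leading coefficient a should come out near 1
a, b, c = np.polyfit(x, y, 2)
print(a, b, c)
```

The recovered leading coefficient sitting near 1 confirms the curvature seen in the scatter plot.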
Jointplot is specific to the seaborn library; it can be used to quickly visualize and analyze the relationship between two variables and to describe their individual distributions on the same plot.
Let’s start with using joint plot for producing the scatter plot.
### Defining the data.
mean, covar = [0, 1], [[1, 0], [0, 50]]
### Drawing random samples from a multivariate normal distribution.
### Two random variables are created, each containing 500 values, with the given mean and covariance.
data = np.random.multivariate_normal(mean, covar, 500)
### Storing the variables in a dataframe.
df = pd.DataFrame(data=data, columns=['X', 'Y'])
### Joint plot between X and Y
sns.jointplot(df.X, df.Y, kind='scatter')
plt.show()
Next, we can use jointplot to find the best line or curve that fits the plot.
sns.jointplot(df.X, df.Y, kind='reg')
plt.show()
Apart from this, jointplot can also be used to plot ‘kde’, ‘hex plot’, and ‘residual plot’.
We can use a scatter plot to visualize the relationship between two variables. But when a dataset contains more than two variables (which is quite often the case), visualizing the relationship of each variable with every other variable becomes a tedious task.
The seaborn pairplot function does the same thing for us and in just one line of code. It is used to plot multiple pairwise bivariate (two variable) distribution in a dataset. It creates a matrix and plots the relationship for each pair of columns. It also draws a univariate distribution for each variable on the diagonal axes.
### Loading a dataset from the sklearn toy datasets
from sklearn.datasets import load_linnerud
### Loading the data
linnerud_data = load_linnerud()
### Extracting the column data
data = linnerud_data.data
Sklearn stores the data as a numpy array rather than a DataFrame, so we first convert it into one.
### Creating a dataframe
data = pd.DataFrame(data=data, columns=linnerud_data.feature_names)
### Plotting a pairplot
sns.pairplot(data=data)
So, in the matrix of plots above, we can see the relationship of each variable with every other variable, and thus infer which variables are most correlated.
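As a numerical complement to pairplot (an illustrative sketch using synthetic data), the same pairwise relationships can be read off from DataFrame.corr():

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.random(200)
df = pd.DataFrame({
    'a': a,
    'b': 2 * a + 0.05 * rng.random(200),  # nearly a linear function of 'a'
    'c': rng.random(200),                 # independent of 'a'
})

corr = df.corr()  # matrix of pairwise Pearson correlations
print(corr.loc['a', 'b'], corr.loc['a', 'c'])
```

Strongly related columns show coefficients near ±1, while unrelated ones hover near 0, matching what the pairplot panels suggest visually.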
Visualizations play an important role in data analysis and exploration. In this blog, we got introduced to different kinds of plots used for analyzing continuous variables. Next week, we will explore the various data visualization techniques that can be applied to categorical variables or variables with discrete values. Meanwhile, I encourage you to download the iris dataset or any other dataset of your choice and explore the techniques learned in this blog.
Have anything to say? Feel free to comment below for any questions, suggestions, and discussions related to this article. Till then, Sayōnara.
The post Data visualization for beginners – Part 1 appeared first on HackerEarth Blog.
]]>The post 11 open source frameworks for AI and machine learning models appeared first on HackerEarth Blog.
]]>The meteoric rise of artificial intelligence in the last decade has spurred a huge demand for AI and ML skills in today’s job market. ML-based technology is now used in almost every industry vertical from finance to healthcare. In this blog, we have compiled a list of best frameworks and libraries that you can use to build machine learning models.
1) TensorFlow
Developed by Google, TensorFlow is an open-source software library built for deep learning and artificial neural networks. With TensorFlow, you can create neural networks and computation models using dataflow graphs. It is one of the most well-maintained and popular open-source libraries available for deep learning. The TensorFlow framework is available in C++ and Python. Other Python-based deep learning frameworks include Theano, Torch, Lasagne, Blocks, MXNet, PyTorch, and Caffe. You can use TensorBoard for easy visualization of the computation pipeline, and its flexible architecture allows you to deploy easily on different kinds of devices.
On the downside, early releases of TensorFlow lacked symbolic loops, offered limited distributed training, and did not support Windows, though later versions have addressed these gaps.
2) Theano
Theano is a Python library designed for deep learning. Using the tool, you can define and evaluate mathematical expressions including multi-dimensional arrays. Optimized for GPU, the tool comes with features including integration with NumPy, dynamic C code generation, and symbolic differentiation. However, to get a high level of abstraction, the tool will have to be used with other libraries such as Keras, Lasagne, and Blocks. The tool supports platforms such as Linux, Mac OS X, and Windows.
3) Torch
Torch is an easy-to-use open-source computing framework for ML algorithms. The tool offers efficient GPU support, an N-dimensional array, numeric optimization routines, linear algebra routines, and routines for indexing, slicing, and transposing. Based on a scripting language called Lua, the tool comes with an ample number of pre-trained models. This flexible and efficient ML research tool supports major platforms such as Linux, Android, Mac OS X, iOS, and Windows.
4) Caffe
Caffe is a popular deep learning tool designed for building apps. Created by Yangqing Jia for a project during his Ph.D. at UC Berkeley, the tool has a good MATLAB/C++/Python interface. It allows you to define and train neural networks through plain-text configuration files, without writing code, and it partially supports multi-GPU training. The tool supports operating systems such as Ubuntu, Mac OS X, and Windows.
5) Microsoft CNTK
Microsoft Cognitive Toolkit (CNTK) is one of the fastest deep learning frameworks, with C#/C++/Python interface support. The open-source framework comes with a powerful C++ API and is faster than TensorFlow on several benchmarks. The tool also supports distributed learning with built-in data readers, and algorithms such as feed-forward networks, CNN, RNN, LSTM, and sequence-to-sequence models. It supports Windows and Linux.
6) Keras
Written in Python, Keras is an open-source library designed to make the creation of new DL models easy. This high-level neural network API can be run on top of deep learning frameworks like TensorFlow, Microsoft CNTK, etc. Known for its user-friendliness and modularity, the tool is ideal for fast prototyping. The tool is optimized for both CPU and GPU.
7) SciKit-Learn
SciKit-Learn is an open-source Python library designed for machine learning. The tool based on libraries such as NumPy, SciPy, and matplotlib can be used for data mining and data analysis. SciKit-Learn is equipped with a variety of ML models including linear and logistic regressors, SVM classifiers, and random forests. The tool can be used for multiple ML tasks such as classification, regression, and clustering. The tool supports operating systems like Windows and Linux. On the downside, it is not very efficient with GPU.
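As a quick taste of the library (a minimal illustrative sketch with synthetic data), fitting and evaluating a classifier takes only a few lines:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((200, 2))
y = (X[:, 0] + X[:, 1] > 1).astype(int)  # simple linearly separable labels

# Every sklearn estimator follows the same fit/predict/score pattern
clf = LogisticRegression().fit(X, y)
print(clf.score(X, y))  # training accuracy
```

Swapping in an SVM or a random forest only changes the class name; the fit/predict interface stays identical, which is what makes the library so approachable.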
8) Accord.NET
Written in C#, Accord.NET is an ML framework designed for building production-grade computer vision, computer audition, signal processing and statistics applications. It is a well-documented ML framework that makes audio and image processing easy. The tool can be used for numerical optimization, artificial neural networks, and visualization. It supports Windows.
9) Spark MLlib
Apache Spark’s MLlib is an ML library that can be used in Java, Scala, Python, and R. Designed for processing large-scale data, this powerful library comes with many algorithms and utilities for classification, regression, and clustering. The tool interoperates with NumPy in Python and with R libraries, and it can be easily plugged into Hadoop workflows.
10) Azure ML Studio
Azure ML studio is a modern cloud platform for data scientists. It can be used to develop ML models in the cloud. With a wide range of modeling options and algorithms, Azure is ideal for building larger ML models. The service provides 10GB of storage space per account. It can be used with R and Python programs.
11) Amazon Machine Learning
Amazon Machine Learning (AML) is an ML service that provides tools and wizards for creating ML models. With visual aids and easy-to-use analytics, AML aims to make ML more accessible to developers. AML can be connected to data stored in Amazon S3, Redshift, or RDS.
Machine learning frameworks come with pre-built components that are easy to understand and code. A good ML framework thus reduces the complexity of defining ML models. With these open-source ML frameworks, you build your ML models easily and quickly.
Know an ML framework that should be on this list? Share them in comments below.
The post 11 open source frameworks for AI and machine learning models appeared first on HackerEarth Blog.
]]>The post Leverage machine learning to amplify your social impact appeared first on HackerEarth Blog.
]]>“Data is abundant and cheap but knowledge is scarce and expensive.”
In the last few years, there has been a data revolution that has transformed the way we source, capture, and interact with data. From fortune 500 firms to start-ups, healthcare to fintech, machine learning and data science have become an integral part of everyday operations of most companies. Of all the sectors, the social good sector has not seen the push the other sectors have. It is not that the machine learning and data science techniques don’t work for this sector, but the lack of financial support and staff has stopped them from creating their special brand of magic here.
At HackerEarth, we intend to tackle this issue by sponsoring machine learning and data science challenges for social good.
Even though machine learning (ML) is a new wing at HackerEarth, this is the fastest growing unit in the company. Also, over the past year, we have grown to a community of 200K+ machine learning and data science enthusiasts. We have conducted 50+ challenges across sectors with an average of 6500+ people participating in each.
The “Machine Learning Challenges for Social Good” initiative is focused toward bringing interesting real-world data problems faced by nonprofits and governmental and non-governmental organizations to the machine learning and data science community’s notice. This is a win-win for both communities because the nonprofits and governmental and non-governmental organizations get their challenges addressed, and the machine learning and data science community gets to hone their skills while being agents of change.
HackerEarth will contribute by
Are you a nonprofit or a governmental/non-governmental organization with a business/social problem for which primary or secondary data is available? If yes, please mail us at social@hackerearth.com. [Please use the subject line “Reg: Machine Learning for Social Good | {Your Organization Name}”]
The post Leverage machine learning to amplify your social impact appeared first on HackerEarth Blog.
]]>The post Artificial Intelligence 101: How to get started appeared first on HackerEarth Blog.
]]>What is Artificial Intelligence (AI)?
Are you thinking of Chappie, Terminator, and Lucy? Sentient, self-aware robots are closer to becoming a reality than you think. Developing computer systems that equal or exceed human intelligence is the crux of artificial intelligence. Artificial Intelligence (AI) is the branch of computer science focused on developing software or machines that exhibit human intelligence. A simple enough definition, right?
Obviously, there is a lot more to it. AI is a broad topic ranging from simple calculators to self-steering technology to something that might radically change the future.
Goals and Applications of AI
The primary goals of AI include deduction and reasoning, knowledge representation, planning, natural language processing (NLP), learning, perception, and the ability to manipulate and move objects. Long-term goals of AI research include achieving Creativity, Social Intelligence, and General (human level) Intelligence.
AI has heavily influenced different sectors that we may not recognize. Ray Kurzweil says “Many thousands of AI applications are deeply embedded in the infrastructure of every industry.” John McCarthy, one of the founders of AI, once said that “as soon as it works, no one calls it AI anymore.”
Broadly, AI is classified into the following:
Source: Bluenotes
Types of AI
While there are various forms of AI as it’s a broad concept, we can divide it into the following three categories based on AI’s capabilities:
Weak AI, which is also referred to as Narrow AI, focuses on one task. There is no self-awareness or genuine intelligence in case of a weak AI.
iOS Siri is a good example of a weak AI combining several weak AI techniques to function. It can do a lot of things for the user, and you’ll see how “narrow” it exactly is when you try having conversations with the virtual assistant.
Strong AI, which is also referred to as True AI, is a computer that is as smart as the human brain. This sort of AI will be able to perform all tasks that a human could do. There is a lot of research going on in this field, but we still have much to do. You should be imagining Matrix or I, Robot here.
Artificial Superintelligence is going to blow your mind if Strong AI impressed you. Nick Bostrom, leading AI thinker, defines it as “an intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills.”
Artificial Superintelligence is the reason why many prominent scientists and technologists, including Stephen Hawking and Elon Musk, have raised concerns about the possibility of human extinction.
How can you get started?
The first thing you need to do is learn a programming language. Though there are a lot of languages that you can start with, Python is what many prefer to start with because its libraries are better suited to Machine Learning.
Here are some good resources for Python:
Introduction to Bots
A bot is the most basic example of a weak AI that can perform automated tasks on your behalf. Chatbots were among the first automated programs to be called “bots,” and you need AI and ML to build a good one. Web crawlers used by search engines such as Google are a perfect example of a sophisticated and advanced bot.
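To make the idea concrete, here is a toy rule-based bot in Python (a hypothetical sketch; real chatbots rely on NLP and ML rather than simple keyword matching):

```python
def simple_bot(message):
    """Match keywords in the incoming message to canned replies."""
    rules = {
        'hello': 'Hi there! How can I help?',
        'weather': "I can't check the weather yet, sorry.",
        'bye': 'Goodbye!',
    }
    text = message.lower()
    for keyword, reply in rules.items():
        if keyword in text:
            return reply
    return "Sorry, I don't understand."

print(simple_bot('Hello, bot!'))  # → Hi there! How can I help?
```

Even this trivial loop captures the essence of weak AI: a narrow, automated response to input, with no understanding behind it.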
You should learn the following before you start programming bots to make your life easier.
How can you build your first bot?
You can start learning how to create bots in Python through the following tutorial in the simplest way.
You can also start by using APIs and tools that offer the ability to build end-user applications. This helps you by actually building something without worrying too much about the theory at first. Some of the APIs that you can use for this are:
Here’s a listing of a few BOT problems for you to practice and try out before you attempt the ultimate challenge.
What now?
Once you have a thorough understanding of your preferred programming language and enough practice with the basics, you should start learning more about Machine Learning. In Python, start learning the Scikit-learn, NLTK, SciPy, PyBrain, and Numpy libraries, which will be useful while writing Machine Learning algorithms. You need to know advanced math and statistics as well.
Here is a list of resources for you to learn and practice:
Here are a few more valuable links:
You should also take part in various AI and BOT Programming Contests at different places on the Internet:
Before you start learning and contributing to the field of AI, read how AI is rapidly changing the world.
The post Artificial Intelligence 101: How to get started appeared first on HackerEarth Blog.
]]>