Messages - s.arman

Pages: 1 [2] 3 4 ... 18
5 Common types of Bias
1- Sample bias
Happens when the collected data doesn’t accurately represent the environment the program is expected to run in.
No algorithm can be trained on the entire universe of data; every model is trained on a subset that must be chosen carefully.

There’s a science of choosing this subset that is both large enough and representative enough to mitigate sample bias.

Example: Security cameras
If your goal is to create a model that can operate security cameras at daytime and nighttime, but you train it on nighttime data only, you’ve introduced sample bias into your model.

Sample bias can be reduced or eliminated by:

Training your model on both daytime and nighttime data.
Covering all the cases you expect your model to be exposed to. This can be done by examining the domain of each feature and making sure the data covers all of it in a balanced, evenly distributed way. Otherwise, the model will produce erroneous results and outputs that don’t make sense.
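One way to examine a feature's domain is to count how many training examples fall into each value you expect to meet in production. The sketch below is only an illustration: the `coverage_report` helper, the `time_of_day` feature name, and the 10% threshold are assumptions made for this example, not something from the original post.

```python
from collections import Counter

def coverage_report(samples, feature, expected_values, min_share=0.10):
    """Return the expected values of `feature` that are under-represented.

    `min_share` is an arbitrary illustrative threshold, not a standard.
    """
    counts = Counter(sample[feature] for sample in samples)
    total = sum(counts.values())
    gaps = {}
    for value in expected_values:
        share = counts.get(value, 0) / total if total else 0.0
        if share < min_share:
            gaps[value] = share
    return gaps

# Hypothetical security-camera dataset recorded only at night.
training = [{"time_of_day": "night"} for _ in range(50)]
gaps = coverage_report(training, "time_of_day", ["day", "night"])
print(gaps)  # {'day': 0.0}
```

An empty result means every expected value reaches the chosen share; it does not prove the data is representative in other respects.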
2- Exclusion bias
Happens as a result of excluding some feature(s) from our dataset usually under the umbrella of cleaning our data.
We delete some feature(s) thinking that they’re irrelevant to our labels/outputs based on pre-existing beliefs.

Example: Titanic Survival prediction
In the famous Titanic problem, where we predict who survived and who didn’t, one might disregard the passenger id of the travelers, thinking that it is completely irrelevant to whether they survived or not.
Little did they know that Titanic passengers were assigned rooms according to their passenger id. The smaller the id number, the closer their assigned rooms were to the lifeboats, which let those people reach the lifeboats faster than those who were deep in the center of the Titanic. As a result, the survival rate falls as the id increases.
Note that the assumption that the id affects the label is not based on the actual dataset; I’m just formulating an example.

Exclusion bias can be reduced or eliminated by:

Investigating feature(s) with sufficient analysis before discarding them.
Asking a colleague to look into the feature(s) you’re considering discarding; a fresh pair of eyes will definitely help.
If you’re low on time/resources and need to cut your dataset size by discarding feature(s), research the relation between each feature and your label before deleting any. Most probably you’ll find similar solutions; investigate whether they’ve taken similar features into account and decide then.
Better still, since humans are subject to bias, there are tools that can help. Take a look at this article (Explaining Feature Importance by example of a Random Forest), which covers various ways to calculate feature importance, including methods that don’t require high computational resources.
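As a cheap first check before dropping a feature, you can measure how strongly it moves with the label. The sketch below computes a plain Pearson correlation using only the standard library; it is not the method from the linked article, the data is invented to mirror the made-up passenger id story above, and a low correlation does not rule out a non-linear relationship.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

# Invented data: passengers with low ids survive more often.
passenger_id = [1, 2, 3, 4, 5, 6, 7, 8]
survived = [1, 1, 1, 0, 1, 0, 0, 0]
r = pearson(passenger_id, survived)
print(round(r, 2))  # -0.76
```

A value this far from zero would be a reason to keep the feature and investigate further rather than delete it.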
3- Observer bias (aka experimenter bias)
The tendency to see what we expect to see, or what we want to see. When a researcher studies a certain group, they usually come to an experiment with prior knowledge and subjective feelings about the group being studied. In other words, they come to the table with conscious or unconscious prejudices.

Example: Is Intelligence influenced by status? — The Burt Affair
One famous example of observer bias is the work of Cyril Burt, a psychologist best known for his work on the heritability of IQ. He thought that children from families with low socioeconomic status (i.e. working-class children) were more likely to have lower intelligence than children from higher socioeconomic statuses. His allegedly scientific approach to intelligence testing was considered revolutionary and allegedly proved that children from the working classes were, in general, less intelligent. This led to the creation of a two-tier educational system in England in the 1960s, which sent middle- and upper-class children to elite schools and working-class children to less desirable schools.
Burt’s research was later discredited, and it was concluded that he had falsified data. His conclusions are therefore no longer regarded as reliable evidence that intelligence is fixed by heredity or social class.

Observer bias can be reduced or eliminated by:

Ensuring that observers (people conducting experiments) are well trained.
Screening observers for potential biases.
Having clear rules and procedures in place for the experiment.
Making sure behaviors are clearly defined.

4- Prejudice bias
Happens as a result of cultural influences or stereotypes. Things that we don’t like in our reality, such as judging by appearance, social class, status, gender and much more, go uncorrected in our machine learning model, so the model applies the same stereotyping that exists in real life because of the prejudiced data it is fed.

Example: A computer vision program that detects people at work
Say your goal is to detect people at work, and your model has been fed thousands of training examples in which men are coding and women are cooking. The algorithm is likely to learn that coders are men and chefs are women, which is wrong since women can code and men can cook.

The problem here is that the data is consciously or unconsciously reflecting stereotypes.

Prejudice bias can be reduced or eliminated by:

Ignoring the statistical relationship between gender and occupation.
Exposing the algorithm to a more even-handed distribution of examples.
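One simple way to give the algorithm a more even-handed distribution is to downsample every group to the size of the smallest one. The sketch below assumes each example carries a (gender, activity) label; the `balance_groups` helper and the counts are hypothetical, and downsampling is only one option (it throws data away, so collecting more examples of the rare groups is often preferable).

```python
import random
from collections import defaultdict

def balance_groups(examples, key, seed=0):
    """Downsample so every group defined by `key` has the same size."""
    groups = defaultdict(list)
    for example in examples:
        groups[key(example)].append(example)
    target = min(len(members) for members in groups.values())
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    balanced = []
    for members in groups.values():
        balanced.extend(rng.sample(members, target))
    return balanced

# Hypothetical image labels: (gender, activity), heavily skewed.
labels = ([("man", "coding")] * 90 + [("woman", "coding")] * 10
          + [("man", "cooking")] * 10 + [("woman", "cooking")] * 90)
balanced = balance_groups(labels, key=lambda pair: pair)
print(len(balanced))  # 40
```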
5- Measurement bias
Systematic value distortion happens when there’s an issue with the device used to observe or measure. This kind of bias tends to skew the data in a particular direction.

Example: Shooting images data with a camera that increases the brightness.
This faulty measurement tool fails to replicate the environment in which the model will operate. In other words, it distorts the training data so that it no longer represents the real data the model will work on once it’s launched.

This kind of bias can’t be avoided simply by collecting more data.

Measurement bias can be reduced or eliminated by:

Having multiple measuring devices.
Hiring humans who are trained to compare the output of these devices.
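With two devices recording the same scenes, a systematic distortion shows up as a consistent gap between their paired readings. The sketch below is a minimal illustration under that assumption; the camera names and brightness values are invented, and the comparison cannot tell which of the two devices is the faulty one.

```python
from statistics import mean

def systematic_offset(device_a, device_b):
    """Mean paired difference between two devices measuring the same scenes."""
    if len(device_a) != len(device_b):
        raise ValueError("readings must be paired")
    return mean(a - b for a, b in zip(device_a, device_b))

# Hypothetical mean-brightness readings (0-255) of the same five scenes.
camera_a = [120, 95, 200, 60, 150]  # suspected of brightening images
camera_b = [100, 75, 180, 40, 130]
offset = systematic_offset(camera_a, camera_b)
print(offset)  # 20
```

An offset near zero suggests the devices agree on average; a large, stable offset is the skew "in a particular direction" described above.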

Machine Learning/ Deep Learning / Machine Learning Optimization
« on: April 21, 2019, 02:34:55 AM »
What Is Machine Learning?
JeanFrancoisPuget | May 18 2016 | Visits (38723) 
Can you explain to me what machine learning is? I often get this question from colleagues and customers, and answering it is tricky. What is tricky is to give the intuition behind what machine learning is really useful for.

I'll review common answers and give you my preferred one.

Cognitive Computing
The first category of answer to the question is what IBM calls cognitive computing.  It is about building machines (computers, software, robots, web sites, mobile apps, devices, etc) that do not need to be programmed explicitly.   This view of machine learning can be traced back to Arthur Samuel's definition from 1959:

Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.

Arthur Samuel is one of the pioneers of machine learning.  While at IBM he developed a program that learned how to play checkers better than him.

Samuel's definition is a great definition, but maybe a little too vague.  Tom Mitchell, another well regarded machine learning researcher, proposed a more precise definition in 1998:

Well posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

Let's take an example for the sake of clarity. Let's assume we are developing a credit card fraud detection system. The task T of that system is to flag credit card transactions as fraudulent or not. The performance measure P could be the percentage of fraudulent transactions that are detected. The system learns if the percentage of fraudulent transactions that are detected increases over time. Here the experience E is the set of already processed transaction records. Once a transaction is processed, we know whether it is a fraud or not, and we can feed that information to the system for it to learn.

Note that the choice of the performance measure is critical. The one we chose is too simplistic. Indeed, if the system flags all transactions as fraudulent, then it achieves 100% performance, yet such a system would be useless! We need something more sensible, like detecting as much fraud as possible while flagging as few honest transactions as possible as fraud. Fortunately, there are ways to capture this double goal, but we won't discuss them here. The point is that once we have a performance metric, we can tell whether the system learns from experience or not.
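To see why the simplistic measure fails, it helps to compute it next to a second number. The sketch below uses recall (share of fraud detected) and precision (share of flags that are real fraud), one common pairing that is added here for illustration and is not taken from the article; the ten transactions are invented.

```python
def recall_and_precision(actual, flagged):
    """Compute (recall, precision) for binary fraud flags (True = fraud)."""
    true_pos = sum(1 for a, f in zip(actual, flagged) if a and f)
    actual_pos = sum(1 for a in actual if a)
    flagged_pos = sum(1 for f in flagged if f)
    recall = true_pos / actual_pos if actual_pos else 0.0
    precision = true_pos / flagged_pos if flagged_pos else 0.0
    return recall, precision

# Ten hypothetical transactions, two of which are fraudulent.
actual = [True, False, False, True, False, False, False, False, False, False]
flag_everything = [True] * 10
print(recall_and_precision(actual, flag_everything))  # (1.0, 0.2)
```

Flagging everything detects 100% of the fraud, but only 20% of its flags are correct, which exposes the useless system.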

Machine Learning Algorithms
The above definitions are great as they set a clear goal for machine learning. However, they do not tell us how to achieve that goal. We should make our definition more specific. This brings us to the second category of definitions, which describe machine learning algorithms. Here are some of the most popular ones. In each case the algorithm is given a set of examples to learn from.

Supervised Learning.   The algorithm is given training data which contains the "correct answer" for each example.  For instance, a supervised learning algorithm for credit card fraud detection would take as input a set of recorded transactions.  For each transaction, the training data would contain a flag that says if it is fraudulent or not.
Unsupervised Learning. The algorithm looks for structure in the training data, like finding which examples are similar to each other and grouping them in clusters.
We have more concrete definitions, but still no clue about what to do next.

Machine Learning Problems
If defining categories of machine learning algorithms isn't good enough, then can we be more specific?  One possible way is to refine the task of machine learning by looking at classes of problems it can solve.  Here are some common ones:

Regression. A supervised learning problem where the answer to be learned is a continuous value.  For instance, the algorithm could be fed with a record of house sales with their price, and it learns how to set prices for houses.
Classification. A supervised learning problem where the answer to be learned is one of finitely many possible values. For instance, in the credit card example the algorithm must learn how to find the right answer between 'fraud' and 'honest'. When there are only two possible values we say it is a binary classification problem.
Segmentation. An unsupervised learning problem where the structure to be learned is a set of clusters of similar examples.  For instance, market segmentation aims at grouping customers in clusters of people with similar buying behavior.
Network analysis. An unsupervised learning problem where the structure to be learned is information about the importance and the role of nodes in the network. For instance, the PageRank algorithm analyzes the network made of web pages and their hyperlinks, and finds which pages are the most important. This is used in web search engines like Google. Other network analysis problems include social network analysis.
The list of problem types where machine learning can help is much longer, but I'll stop here because this isn't helping us that much.  We still don't have a definition that tells us what to do, even if we're getting closer.

Machine Learning Workflow
The issue with the above definitions is that developing a machine learning algorithm isn't enough to get a system that learns. Indeed, there is a gap between a machine learning algorithm and a learning system. I discussed this gap in Machine Learning Algorithm != Learning Machine where I derived this machine learning workflow:


A machine learning algorithm is used in the 'Train' step of the workflow. Its output (a trained model) is then used in the 'Predict' part of the workflow. What differentiates a good machine learning algorithm from a bad one is the quality of predictions we will get in the 'Predict' step. This leads us to yet another definition of machine learning:

The purpose of machine learning is to learn from training data in order to make as good as possible predictions on new, unseen, data.

This is my favorite definition, as it links the 'Train' step to the 'Predict' step of the machine learning workflow.

One thing I like about the above definition is that it explains why machine learning is hard. We need to build a model that defines the answer as a function of the example features. So far so good. The issue is that we must build a model that leads to good predictions on unforeseen data. If you think about it, this seems like an impossible task. How can we evaluate the quality of a model without looking at the data on which we will make predictions? Answering that question is what keeps machine learning researchers busy. The general idea is that we assume that unforeseen data is similar to the data we can see. If a model is good on the data we can see, then it should be good for unforeseen data. Of course, the devil is in the details, and relying blindly on the data we can see can lead to a major issue known as overfitting. I'll come back to this later, and I recommend reading Florian Dahms' What is "overfitting"? in the meantime.

A Simple Example
Let me explain the definition a bit.  Data comes in as a table (a 2D matrix) with one example per row.  Examples are described by features, with one feature per column.  There is a special column which contains the 'correct answer' (the ground truth) for each example.  The following is an example of such data set, coming from past house sales:

Name      Surface (sq. ft)   Rooms   Pool   Price
House1    2,000              4       0      270,000
House5    3,500              6       1      510,000
House12   1,500              4       0      240,000

There are 3 examples, each described by 4 features, a name, the surface, the number of rooms, and the presence of a pool. The target is the price, represented in the last column.  The goal is to find a function that relates the price to the features, for instance:

price = 100 * surface + 20,000 * pool + 15,000 * num_room

Once we have that function, then we can use it with new data.  For instance, when we get a new house, say house22 with 2,000 sq. feet, 3 rooms, and no pool, we can compute a price:

price(house22) = 100 * 2,000 + 20,000 * 0 + 15,000 * 3 = 245,000

Let's assume that house22 is sold at 255,000.  Our predicted price is off by 10,000.  This is the prediction error that we want to minimize.  Another formula for price definition may lead to more accurate price predictions.  The goal of machine learning is to find a price formula that leads to the most accurate predictions for future house sales.
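The arithmetic above can be checked with a few lines of code. The sketch below encodes the price formula from the text and, as an added illustration, measures its average error on the three past sales from the table; the function names are made up for this example.

```python
def predict_price(surface, rooms, pool):
    """Price formula from the text above."""
    return 100 * surface + 20_000 * pool + 15_000 * rooms

# Past sales from the table: (surface, rooms, pool, actual price).
past_sales = [
    (2_000, 4, 0, 270_000),
    (3_500, 6, 1, 510_000),
    (1_500, 4, 0, 240_000),
]

def mean_absolute_error(sales):
    """Average absolute gap between predicted and actual prices."""
    errors = [abs(price - predict_price(s, r, p)) for s, r, p, price in sales]
    return sum(errors) / len(errors)

house22 = predict_price(surface=2_000, rooms=3, pool=0)
print(house22)                          # 245000
print(255_000 - house22)                # 10000
print(mean_absolute_error(past_sales))  # 30000.0
```

A different formula that lowers this average error on past sales is a candidate for better predictions, although only new sales can confirm it.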

In practice, we will look for formulas that provide good predictions on the data we can see, i.e. the above table.  I say formulas, but machine learning is not limited to formulas.  Machine learning models can be much more complex.  Point is that a machine learning model can be used to compute a target (here the price) from example features.  The goal of machine learning is to find a model that leads to good predictions in the future.

Some of the definitions listed above are taken from Andrew Ng's Stanford machine learning course. I recommend this course (or the updated version available for free on Coursera) for those willing to take a deep dive into machine learning.

I found a more formal statement of my favorite definition in this presentation by Peter Prettenhofer and Gilles Louppe (if a reader knows when this definition was first used, then please let me know):


Resources / Re: Self-Assessment Manual (Second Edition, 2016)
« on: April 21, 2019, 02:32:43 AM »

Machine Learning/ Deep Learning / Data version control with DVC
« on: April 21, 2019, 02:30:42 AM »
DataOps is very important in data science, and in my opinion data scientists should pay more attention to it. It’s the least used practice in data science projects. At the moment we normally version code (with something like Git), and more people and organizations are starting to version their models. But what about data?

I’ll cover in detail how to use Git and DVC together with other tools for versioning almost everything that goes into a data science (and scientific) project in an upcoming article.

Recently, the DVC project creator Dmitry Petrov gave an interview to Tobias Macey at Podcast.__init__ a Python podcast. In this blog post, I provide a transcript of the interview. You might find interesting the ideas behind DVC and how Dmitry sees the future of data science and data engineering.

You can hear the podcast here:

Version Control For Machine Learning Projects

An interview with the creator of DVC about how it improves collaboration and reduces duplicate effort on data science…   
We need to pay more attention to how we organize our work. We need to pay more attention to how we structure our projects and find the places where we waste time instead of doing actual work. And it is very important to be more organized and more productive as a data scientist, because today we are still in the Wild West.

The transcript
Disclaimer: This transcript is the result of listening to the podcast and writing down what I heard. I used some software to help with the transcription, but most of the work was done by my ears and hands, so if you can improve this transcription, please feel free to leave a comment below :)
Your host as usual is Tobias Macey, and today I’m interviewing Dmitry Petrov about DVC, an open source version control system for machine learning projects. So Dmitry, could you start by introducing yourself?

for more :

What if we could detect anomalies of the colon at an early stage to prevent colon cancer? We are now in a technology era that is capable of doing impressive things we couldn’t imagine before. Artificial intelligence can detect more abnormalities than a conventional exam. Physicians should take advantage of this.

According to the American Cancer Society, in the United States colorectal cancer is the third leading cause of cancer-related deaths in men and in women, and the second most common cause of cancer deaths when men and women are combined. It’s expected to cause about 51,020 deaths during 2019.

Inspired by the #AISTARTUPCHALLENGE created by Siraj Raval, I decided to join the challenge! You can check it out at his Instagram account (Siraj Raval). The dynamic is to create an app that uses AI to solve a problem, get 3 paying customers for your app and submit it to win different prizes.

I will start with this healthcare project that classifies 8 different tissues in histological images of human colorectal cancer.

Colorectal Histology MNIST
Let’s get to know more about the dataset I will be using. I got this dataset from Kaggle, and it contains a collection of textures in histological images of human colorectal cancer. It has about 5,000 histological RGB samples of 150x150 px, divided into eight tissue categories (specified by the folder name):

My goal is to identify each category. You can also create a model that classifies between Normal and Benign, but in this case, let’s identify all the different anomalies.

FastAi Model


Have you ever summarized a lengthy document into a short paragraph? How long did you take? Manually generating a summary can be time consuming and tedious. Automatic text summarization promises to overcome such difficulties and allow you to generate the key ideas in a piece of writing easily.
Text summarization is the technique for generating a concise and precise summary of voluminous texts while focusing on the sections that convey useful information, and without losing the overall meaning.
Automatic text summarization aims to transform lengthy documents into shortened versions, something which could be difficult and costly to undertake if done manually.
Machine learning algorithms can be trained to comprehend documents and identify the sections that convey important facts and information before producing the required summarized texts. For example, the image below is of this news article that has been fed into a machine learning algorithm to generate a summary.

An online news article that has been summarized using a text summarization machine learning algorithm
The need for text summarization
With the present explosion of data circulating the digital space, which is mostly non-structured textual data, there is a need to develop automatic text summarization tools that allow people to get insights from them easily. Currently, we enjoy quick access to enormous amounts of information. However, most of this information is redundant, insignificant, and may not convey the intended meaning. For example, if you are looking for specific information from an online news article, you may have to dig through its content and spend a lot of time weeding out the unnecessary stuff before getting the information you want. Therefore, using automatic text summarizers capable of extracting useful information that leaves out inessential and insignificant data is becoming vital. Implementing summarization can enhance the readability of documents, reduce the time spent in researching for information, and allow for more information to be fitted in a particular area.

The main types of text summarization
Broadly, there are two approaches to summarizing texts in NLP: extraction and abstraction.

Extraction-based summarization
In extraction-based summarization, a subset of words that represent the most important points is pulled from a piece of text and combined to make a summary. Think of it as a highlighter—which selects the main information from a source text.

Highlighter = Extractive-based summarization
In machine learning, extractive summarization usually involves weighing the essential sections of sentences and using the results to generate summaries.

Different types of algorithms and methods can be used to gauge the weights of the sentences and then rank them according to their relevance and similarity with one another—and further joining them to generate a summary. Here's an example:

Extractive-based summarization in action.
As you can see above, the extracted summary is composed of the words highlighted in bold, although the results may not be grammatically accurate.

Abstraction-based summarization
In abstraction-based summarization, advanced deep learning techniques are applied to paraphrase and shorten the original document, just like humans do. Think of it as a pen—which produces novel sentences that may not be part of the source document.

Pen = Abstraction-based summarization
Since abstractive machine learning algorithms can generate new phrases and sentences that represent the most important information from the source text, they can assist in overcoming the grammatical inaccuracies of the extraction techniques. Here is an example:

Abstraction-based summary in action.
Although abstraction performs better at text summarization, developing its algorithms requires complicated deep learning techniques and sophisticated language modeling.

To generate plausible outputs, abstraction-based summarization approaches must address a wide variety of NLP problems, such as natural language generation, semantic representation, and inference permutation.

As such, extractive text summarization approaches are still widely popular. In this article, we’ll be focusing on an extraction-based method.

How to perform text summarization
Let’s use a short paragraph to illustrate how extractive text summarization can be performed.

Here is the paragraph:

“Peter and Elizabeth took a taxi to attend the night party in the city. While in the party, Elizabeth collapsed and was rushed to the hospital. Since she was diagnosed with a brain injury, the doctor told Peter to stay besides her until she gets well. Therefore, Peter stayed with her at the hospital for 3 days without leaving.”
Here are the steps to follow to summarize the above paragraph, while trying to maintain its intended meaning, as much as possible.

Step 1: Convert the paragraph into sentences

First, let’s split the paragraph into its corresponding sentences. The best way of doing the conversion is to extract a sentence whenever a period appears.

1. Peter and Elizabeth took a taxi to attend the night party in the city

2. While in the party, Elizabeth collapsed and was rushed to the hospital

3. Since she was diagnosed with a brain injury, the doctor told Peter to stay besides her until she gets well

4. Therefore, Peter stayed with her at the hospital for 3 days without leaving
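The period-based split described above can be sketched with a regular expression. This is a deliberately naive rule, written only for illustration: it would wrongly split on abbreviations such as "Dr." and ignores other sentence endings like "?" or "!".

```python
import re

def split_sentences(paragraph):
    """Split on periods followed by whitespace or end of text (naive rule)."""
    parts = re.split(r"\.(?:\s+|$)", paragraph.strip())
    return [part.strip() for part in parts if part.strip()]

paragraph = (
    "Peter and Elizabeth took a taxi to attend the night party in the city. "
    "While in the party, Elizabeth collapsed and was rushed to the hospital. "
    "Since she was diagnosed with a brain injury, the doctor told Peter to "
    "stay besides her until she gets well. Therefore, Peter stayed with her "
    "at the hospital for 3 days without leaving."
)
sentences = split_sentences(paragraph)
print(len(sentences))  # 4
```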

Step 2: Text processing

Next, let’s do text processing by removing the stop words (extremely common words with little meaning such as “and” and “the”), numbers, punctuation, and other special characters from the sentences.

Performing the filtering assists in removing redundant and insignificant information which may not provide any added value to the text’s meaning.

Here is the result of the text processing:

1. Peter Elizabeth took taxi attend night party city

2. Party Elizabeth collapse rush hospital

3. Diagnose brain injury doctor told Peter stay besides get well

4. Peter stay hospital days without leaving
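The filtering step can be sketched as follows. The stop-word list here is a tiny hand-picked assumption that happens to cover the example paragraph; real projects use a much larger list. The sketch also does not reduce words to their base form (for example "stayed" to "stay"), which the processed sentences above do.

```python
import re

# Tiny illustrative stop-word list, chosen to fit the example paragraph.
STOP_WORDS = {
    "a", "and", "at", "for", "her", "in", "she", "since", "the",
    "therefore", "to", "until", "was", "while", "with",
}

def clean_sentence(sentence):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [word for word in words if word not in STOP_WORDS]

first = clean_sentence(
    "Peter and Elizabeth took a taxi to attend the night party in the city"
)
print(" ".join(first))  # peter elizabeth took taxi attend night party city
```

Because the regular expression keeps only letters, numbers such as "3" and punctuation disappear in the same pass.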

Step 3: Tokenization

Tokenizing the sentences is done to get all the words present in the sentences. Here is a list of the words:

['peter','elizabeth','took','taxi','attend','night','party','city','party','elizabeth','collapse','rush','hospital', 'diagnose','brain', 'injury', 'doctor','told','peter','stay','besides','get','well','peter', 'stay','hospital','days','without','leaving']
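A minimal sketch of this step, assuming the processed sentences from Step 2 as input and plain whitespace splitting as the tokenizer (real tokenizers handle many more cases):

```python
processed = [
    "Peter Elizabeth took taxi attend night party city",
    "Party Elizabeth collapse rush hospital",
    "Diagnose brain injury doctor told Peter stay besides get well",
    "Peter stay hospital days without leaving",
]

def tokenize(sentences):
    """Lowercase each sentence and split it on whitespace."""
    return [word for sentence in sentences for word in sentence.lower().split()]

tokens = tokenize(processed)
print(len(tokens))  # 29
```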

Step 4: Evaluate the weighted occurrence frequency of the words

Thereafter, let’s calculate the weighted occurrence frequency of all the words. To achieve this, let’s divide the occurrence frequency of each of the words by the frequency of the most recurrent word in the paragraph, which is “Peter” that occurs three times.

Here is a table that gives the weighted occurrence frequency of each of the words.

Word        Frequency   Weighted frequency
peter       3           1
elizabeth   2           0.67
took        1           0.33
taxi        1           0.33
attend      1           0.33
night       1           0.33
party       2           0.67
city        1           0.33
collapse    1           0.33
rush        1           0.33
hospital    2           0.67
diagnose    1           0.33
brain       1           0.33
injury      1           0.33
doctor      1           0.33
told        1           0.33
stay        2           0.67
besides     1           0.33
get         1           0.33
well        1           0.33
days        1           0.33
without     1           0.33
leaving     1           0.33
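The table can be reproduced by counting the tokens and dividing every count by the largest one. The sketch below uses the standard library's `Counter`; the helper name `weighted_frequencies` is an assumption for this example.

```python
from collections import Counter

def weighted_frequencies(tokens):
    """Divide each word count by the count of the most frequent word."""
    counts = Counter(tokens)
    highest = max(counts.values())
    return {word: count / highest for word, count in counts.items()}

tokens = (
    "peter elizabeth took taxi attend night party city "
    "party elizabeth collapse rush hospital "
    "diagnose brain injury doctor told peter stay besides get well "
    "peter stay hospital days without leaving"
).split()
weights = weighted_frequencies(tokens)
print(weights["peter"], round(weights["party"], 2))  # 1.0 0.67
```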
Step 5: Substitute words with their weighted frequencies

Let’s substitute each of the words found in the original sentences with their weighted frequencies. Then, we’ll compute their sum.

Since the weighted frequencies of the insignificant words, such as stop words and special characters, which were removed during the processing stage, are zero, it’s not necessary to add them.

1   Peter and Elizabeth took a taxi to attend the night party in the city   1 + 0.67 + 0.33 + 0.33 + 0.33 + 0.33 + 0.67 + 0.33   3.99
2   While in the party, Elizabeth collapsed and was rushed to the hospital   0.67 + 0.67 + 0.33 + 0.33 + 0.67   2.67
3   Since she was diagnosed with a brain injury, the doctor told Peter to stay besides her until she gets well   0.33 + 0.33 + 0.33 + 0.33 + 0.33 + 1 + 0.67 + 0.33 + 0.33 + 0.33   4.31
4   Therefore, Peter stayed with her at the hospital for 3 days without leaving   1 + 0.67 + 0.67 + 0.33 + 0.33 + 0.33   3.33
From the sum of the weighted frequencies of the words, we can deduce that the third sentence carries the most weight in the paragraph, closely followed by the first. Either one can therefore serve as a short representative summary of what the paragraph is about.

Furthermore, if the first and third sentences, the two weightiest in the paragraph, are combined, a better summary can be generated.
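The scoring and selection can be sketched in a few lines. The code below takes the processed sentences from Step 2 as given, uses unrounded weights (so its sums differ slightly from hand-rounded figures), and returns the chosen sentences in their original order; the function names and the choice of k=2 are assumptions for this illustration.

```python
from collections import Counter

processed = [
    "peter elizabeth took taxi attend night party city",
    "party elizabeth collapse rush hospital",
    "diagnose brain injury doctor told peter stay besides get well",
    "peter stay hospital days without leaving",
]

def score_sentences(processed_sentences):
    """Sum each sentence's weighted word frequencies."""
    tokens = [w for s in processed_sentences for w in s.split()]
    counts = Counter(tokens)
    highest = max(counts.values())
    return [sum(counts[w] / highest for w in s.split())
            for s in processed_sentences]

def top_sentences(scores, k=2):
    """Indices of the k highest-scoring sentences, in document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

scores = score_sentences(processed)
print(top_sentences(scores))  # [0, 2]
```

Indices 0 and 2 correspond to the first and third sentences, the pair suggested above for the combined summary.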

The above example just gives a basic illustration of how to perform extraction-based text summarization in machine learning. Now, let’s see how we can apply the concept above in creating a real-world summary generator.


The Hungarian algorithm, also known as the Kuhn-Munkres algorithm, can associate an obstacle from one frame to another, based on a score. There are many scores we can think of:

IOU (Intersection Over Union): if the bounding box overlaps the previous one, it’s probably the same obstacle.
Shape Score: if the shape or size didn’t vary too much between two consecutive frames, the score increases.
Convolution Cost: we could run a CNN (Convolutional Neural Network) on the bounding box and compare the result with the one from the previous frame. If the convolutional features are the same, the objects look the same. If there is a partial occlusion, the convolutional features will stay partly the same and the association will remain.
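The IOU score and the frame-to-frame association can be sketched as below. Note the assumptions: boxes are given as (x1, y1, x2, y2) corners, both frames contain the same number of boxes, and the pairing is found by brute force over all permutations. That brute-force search is a stand-in for the Hungarian algorithm: it reaches the same optimum but in factorial time, so it only suits a handful of boxes.

```python
from itertools import permutations

def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) corner boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def best_assignment(previous, current):
    """Pair each previous box with a current box, maximizing total IoU."""
    best = max(
        permutations(range(len(current))),
        key=lambda perm: sum(iou(previous[i], current[j])
                             for i, j in enumerate(perm)),
    )
    return list(enumerate(best))

previous = [(0, 0, 10, 10), (20, 20, 30, 30)]
current = [(21, 21, 31, 31), (1, 1, 11, 11)]  # same obstacles, listed in swapped order
print(best_assignment(previous, current))  # [(0, 1), (1, 0)]
```

For realistic numbers of obstacles, SciPy's `scipy.optimize.linear_sum_assignment` solves the same assignment problem efficiently.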


Artificial Intelligence / Computer Vision for tracking
« on: April 21, 2019, 02:22:57 AM »
In Computer Vision, one of the most interesting areas of research is obstacle detection using Deep Neural Networks. Many papers have been published, all achieving SOTA (State of the Art) results in detecting obstacles with really high accuracy. The goal of these algorithms is to predict a list of bounding boxes from an input image. Machine Learning has become very good at localising and classifying obstacles in an image in real time. However, none of these algorithms includes the notion of time and continuity. When detecting an obstacle, these algorithms assume it’s a new obstacle every time.

I won’t go into the details of the algorithm here, but you can have a look at this video from Siraj Raval that explains it very well.
The output of the algorithm is a list of bounding boxes, in the format [class, x, y, w, h, confidence]. The class is an id related to a number in a txt file (0 for car, 1 for pedestrian, …). x, y, w and h represent the parameters of the bounding box: x and y are the coordinates of the center, while w and h are its size (width and height). The confidence is a number expressed in %.
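Overlap measures such as IOU are usually easier to compute on corner coordinates than on the center format described above. The sketch below converts one detection; the sample values and the helper name are invented for illustration.

```python
def center_to_corners(x, y, w, h):
    """Convert a center-format box to (x1, y1, x2, y2) corners."""
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)

# Hypothetical detection in the [class, x, y, w, h, confidence] layout.
detection = [0, 50.0, 40.0, 20.0, 10.0, 87.5]
class_id, x, y, w, h, confidence = detection
corners = center_to_corners(x, y, w, h)
print(corners)  # (40.0, 35.0, 60.0, 45.0)
```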


Alibaba says its deep neural network model has outscored humans in a global reading test, paving the way for the underlying technology to reduce the need for human input.

The Chinese tech giant's research unit, the Institute of Data Science and Technologies (IDST), said it had developed a deep-learning model that attained a score of 82.44 in Exact Match on the Stanford Question Answering Dataset (SQuAD). Humans had previously clocked a score of 82.304, it said.

SQuAD comprises more than 100,000 question-and-answer sets based on more than 500 Wikipedia articles, for which participants were required to build machine-learning models to respond to the questions. These models would be evaluated by SQuAD, which then would run the model on the test set.

Various universities, research institutions, and technology vendors were participants, including Tencent, Google, IBM, Microsoft, Samsung, Tel-Aviv University, and South Korea's Kangwon National University. A handful had participated multiple times in the past year, including Microsoft Research Asia, whose previous score of 82.136 was clocked on December 17, 2017, while Alibaba's previous score of 79.199 was recorded on December 28, 2017.

In its statement Monday, the Chinese vendor said it was the first to surpass humans in the test, but SQuAD listed the Chinese vendor as shared leader alongside Microsoft Research Asia, which scored a higher 82.65. SQuAD highlighted Microsoft's rank as "January 3, 2018", while Alibaba's was "January 5, 2018".

A spokesperson for Alibaba explained that the dates indicated when the respective models were submitted. He told ZDNet that the actual test result for Alibaba was officially registered by SQuAD on January 11, 2018--a day ahead of Microsoft's--which gave the Chinese vendor the distinction of being "first" to surpass human scores.

According to Alibaba, its neural network model was based on the Hierarchical Attention Network, which it explained would read "from paragraphs to sentences to words" to identify phrases that could hold potential answers. This underlying technology previously was used in its Singles Day shopping festival to respond to customer inquiries.

The company had said its AI-powered customer service chatbot, Dian Xiaomi, was used to support its online merchants and served an average of 3.5 million users daily across its Taobao and Tmall platforms.

Commenting on the SQuAD score, Alibaba IDST's chief scientist of natural language processing Si Luo, said: "That means objective questions such as 'what causes rain' can now be answered with high accuracy by machines. We believe the technology underneath can be gradually applied to numerous applications such as customer service, museum tutorials, and online responses to medical inquiries from patients, decreasing the need for human input in an unprecedented way."

Si said Alibaba would be "sharing our model-building methodology" with the community and planned to apply the technology to support its customers in the near future.

SQuAD's current ranking



thanks for sharing


Internet of Things / Re: How Blockchains Help IoT
« on: April 21, 2019, 02:10:00 AM »

Advances in Natural Language Processing and Machine Learning are broadening the scope of what technology can do in people’s everyday lives, and because of this, an unprecedented number of people are developing a curiosity about these fields. And with the availability of educational content online, it has never been easier to go from curiosity to proficiency.

We gathered some of our favorite resources together so you will have a jumping-off point into studying these fields on your own. Some of the resources here are suitable for absolute beginners in either Natural Language Processing or Machine Learning, and others are suitable for those with an understanding of one who wish to learn more about the other.

We’ve split these resources into two categories:

Online courses and textbooks for structured learning experiences and reference material
NLP and Machine Learning blogs to benefit from the work of some researchers and students who distill current advances in research into interesting and readable posts.
The resources on this post are 12 of the best, not the 12 best, and as such should be taken as suggestions on where to start learning without spending a cent, nothing more!

Read more at:
