Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Topics - s.arman

Pages: 1 2 [3] 4
31
Machine Learning/ Deep Learning / Slow Learning
« on: November 19, 2018, 06:43:14 PM »
In the most recent edition of The Economist, an article titled “New schemes teach the masses to learn AI” appeared. The article profiles the efforts of fast.ai, a Bay Area non-profit that aims to demystify deep learning and equip the masses to use the technology. I was mentioned in the article as an example of the success of this approach — “A graduate from fast.ai’s first year, Sara Hooker, was hired into Google’s highly competitive ai residency program after finishing the course, having never worked on deep learning before.”

I have spent the last few days feeling uneasy about the article. On the one hand, I do not want to distract from the recognition of fast.ai. Rachel and Jeremy are both people that I admire, and their work to provide access to thousands of students across the world is both needed and one of the first programs of its kind. However, not voicing my unease is equally problematic since it endorses a simplified narrative that is misleading for others who seek to enter this field.

It is true that I both attended the first session of fast.ai and that I was subsequently offered a role as an AI Resident at Google Brain. Nevertheless, attributing my success to a part-time evening 12-week course (parts 1 and 2) creates the false impression of a quick Cinderella story for anyone who wants to teach themselves machine learning. Furthermore, this implication minimizes my own effort and journey.

For some time, I have had clarity about what I love to do. I was not exposed to either machine learning or computer science during my undergraduate degree. I grew up in Africa, in Mozambique, Lesotho, Swaziland and South Africa. My family currently lives in Monrovia, Liberia. My first trip to the US was a flight to Minnesota, where I had accepted a scholarship to attend a small liberal arts school called Carleton College. I arrived for international student orientation without ever having seen the campus before. Coming from Africa, I also did not have any reference point for understanding how cold Minnesota’s winters would be. Despite the severe weather, I enjoyed a wonderful four years studying a liberal arts curriculum and majoring in Economics. My dream had been to be an economist for the World Bank. This was in part because the most technical people I was exposed to during my childhood were economists from organizations like the International Monetary Fund and the World Food Program.

I decided to delay applying for a PhD in economics until a few years after graduation, instead accepting an offer to work with PhD economists in the Bay Area on antitrust issues. We applied economic modeling and statistics to real world cases and datasets to assess whether price fixing had taken place or to determine whether a firm was misusing its power to harm consumers.


First Delta Analytics Presentation to Local Bay Area Non-Profits. Early 2014.
A few months after I moved to San Francisco, some fellow economists (Jonathan Wang, Cecilia Cheng, Asim Manizada, Tom Shannahan, and Eytan Schindelhaim) and I started meeting on weekends to volunteer for nonprofits. We didn’t really know what we were doing, but we thought offering our data skills to non-profits for free might be a useful way of giving back. We emailed a Bay Area non-profit listserv and were amazed by the number of responses. We clearly saw that many non-profits possessed data but were uncertain how to use it to accelerate their impact. That year, we registered as a non-profit called Delta Analytics and were joined by volunteers working as engineers, data analysts and researchers. Delta remains entirely run by volunteers, has no full-time staff, and offers all engagements with non-profits for free. By the time I applied to the Google AI Residency, we had completed projects with over 30 non-profits.


Second cohort of Delta Analytics Volunteers. 2016.
Delta was a turning point in my journey because the data of the partners we worked with was often messy and unstructured. The assumptions required to impose a linear model (such as homoscedasticity, no autocorrelation, normal distribution) were rarely present. I saw first-hand how linear functions, a favorite tool of economists, fell short. I decided that I wanted to know more about more complex forms of modeling.

I joined a startup called Udemy as a data analyst. At the time, Udemy was a 150-person startup that aimed to help anyone learn anything. My boss carved out projects for me that were challenging, had wide impact and pushed me technically. One of the key projects I worked on during my first year was collecting data, developing and deploying Udemy’s first spam detection algorithm.

Working on projects like spam detection convinced me that I wanted to grow technically as an engineer. I wanted to be able to iterate quickly and have end-to-end control over the models I worked on, including deploying them into production. This required becoming proficient at coding. I had started my career working in STATA (a statistical package similar to MATLAB), R, and SQL. Now, I wanted to become fluent at Python. I took part-time night classes at Hackbright and started waking up at 4 am most days to practice coding before work. This is still a regular habit, although now I do so to read papers not directly related to my field of research and carve out time for new areas I want to learn about.

After half a year, while I had improved at coding, I was still not proficient enough to interview as an engineer. At the time, the Udemy data science team was separate from my Analytics team. Udemy invested in me. They approved my transfer to engineering where I started as the first non-PhD data scientist. I worked on recommendation algorithms and learned how to deploy models at scale to millions of people. The move to engineering accelerated my technical growth and allowed me to continue to improve as an engineer.


Udemy data team.
In parallel to my growth at Udemy, I was still working on Delta projects. There are two that I particularly enjoyed. The first (alongside Steven Troxler, Kago Kagichiri, and Moses Mutuku) was working with Eneza Education, an ed-tech social impact company in Nairobi, Kenya. Eneza used pre-smartphone technology to empower more than 4 million primary and secondary students to access practice quizzes by mobile texting. Eneza’s data provided wonderful insights into cell phone usage in Kenya as well as the community’s learning practices. We worked on identifying difficult quizzes that deterred student activity and on improving how learning pathways were tailored to individual need and ability. The second project was with Rainforest Connection (alongside Sean McPherson, Stepan Zapf, Steven Troxler, Cassandra Jacobs, and Christopher Kaushaar), where the goal was to identify illegal deforestation using streamed audio from the rainforest. We worked on infrastructure to convert the audio into spectrograms. Once converted, we structured the problem as image classification and used convolutional neural networks to detect whether chainsaws were present in the audio stream. We also worked on models to better triangulate the sound detected by the recycled cellphones.

Source: https://medium.com/@sarahooker/slow-learning-d9463f6a800b

32
You’ve probably heard that we’re in the midst of an A.I. revolution. We’re told that machine intelligence is progressing at an astounding rate, powered by “deep learning” algorithms that use huge amounts of data to train complicated programs known as “neural networks.”

Today’s A.I. programs can recognize faces and transcribe spoken sentences. We have programs that can spot subtle financial fraud, find relevant web pages in response to ambiguous queries, map the best driving route to almost any destination, beat human grandmasters at chess and Go, and translate between hundreds of languages. What’s more, we’ve been promised that self-driving cars, automated cancer diagnoses, housecleaning robots and even automated scientific discovery are on the verge of becoming mainstream.

The Facebook founder, Mark Zuckerberg, recently declared that over the next five to 10 years, the company will push its A.I. to “get better than human level at all of the primary human senses: vision, hearing, language, general cognition.” Shane Legg, chief scientist of Google’s DeepMind group, predicted that “human-level A.I. will be passed in the mid-2020s.”

As someone who has worked in A.I. for decades, I’ve witnessed the failure of similar predictions of imminent human-level A.I., and I’m certain these latest forecasts will fall short as well. The challenge of creating humanlike intelligence in machines remains greatly underestimated. Today’s A.I. systems sorely lack the essence of human intelligence: understanding the situations we experience, being able to grasp their meaning. The mathematician and philosopher Gian-Carlo Rota famously asked, “I wonder whether or when A.I. will ever crash the barrier of meaning.” To me, this is still the most important question.



The lack of humanlike understanding in machines is underscored by recent cracks that have appeared in the foundations of modern A.I. While today’s programs are much more impressive than the systems we had 20 or 30 years ago, a series of research studies have shown that deep-learning systems can be unreliable in decidedly unhumanlike ways.

I’ll give a few examples.

“The bareheaded man needed a hat” is transcribed by my phone’s speech-recognition program as “The bear headed man needed a hat.” Google Translate renders “I put the pig in the pen” into French as “Je mets le cochon dans le stylo” (mistranslating “pen” in the sense of a writing instrument).

Source: Melanie Mitchell, “Artificial Intelligence Hits the Barrier of Meaning,” The New York Times (opinion)

33
Machine learning practitioners have different personalities. Some are of the “I am an expert in X, and X can train on any type of data” variety, where X is some algorithm; others are “right tool for the right job” people. Many also subscribe to a “jack of all trades, master of one” strategy, with one area of deep expertise and a passing knowledge of the other fields of machine learning. That said, no one can deny that, as practicing data scientists, we have to know the basics of the common machine learning algorithms, which helps us engage with any new problem domain we come across. This is a whirlwind tour of common machine learning algorithms, with quick resources for each to help you get started.

1. Principal Component Analysis (PCA) / SVD
PCA is an unsupervised method for understanding the global properties of a dataset consisting of vectors. The covariance matrix of the data points is analyzed to determine which dimensions (mostly) or data points (sometimes) are more important, i.e. have high variance among themselves but low covariance with the others. One way to think of the top principal components of a matrix is as its eigenvectors with the highest eigenvalues. SVD is essentially another way to compute ordered components, but you don’t need to form the covariance matrix of the points to get it.


This algorithm helps fight the curse of dimensionality by producing data points of reduced dimensionality.
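
As a quick, hedged illustration (toy random data, not from the original post), here is how one might reduce a dataset to its top two components with scikit-learn's PCA and reproduce the projection with NumPy's SVD:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 10)            # 100 toy points in 10 dimensions
X_centered = X - X.mean(axis=0)        # PCA works on mean-centered data

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_centered)
print(X_reduced.shape)                 # (100, 2)
print(pca.explained_variance_ratio_)   # variance captured by each component

# Equivalent projection via SVD (up to sign): rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_reduced_svd = X_centered @ Vt[:2].T
```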

Libraries:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.svd.html

http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html

Introductory Tutorial:
https://arxiv.org/pdf/1404.1100.pdf

2a. Least Squares and Polynomial Fitting
Remember your numerical analysis course in college, where you fit lines and curves to points to get an equation? You can use the same idea in machine learning to fit curves to very small datasets with low dimensionality. (For large datasets, or datasets with many dimensions, you might just end up terribly overfitting, so don't bother.) OLS has a closed-form solution, so you don't need complex optimization techniques.


As is obvious, use this algorithm to fit simple curves and regression lines.
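
A minimal sketch with made-up noisy data, assuming NumPy: a straight line via the closed-form least-squares solver and a cubic via polyfit.

```python
import numpy as np

x = np.linspace(0, 10, 50)
y = 3.0 * x + 2.0 + np.random.normal(scale=1.0, size=x.shape)  # noisy line

# Ordinary least squares: solve A w ~= y in closed form.
A = np.column_stack([x, np.ones_like(x)])   # design matrix [x, 1]
w, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print("slope, intercept:", w)

# Polynomial fitting: coefficients of a degree-3 polynomial, highest power first.
coeffs = np.polyfit(x, y, deg=3)
y_hat = np.polyval(coeffs, x)               # evaluate the fitted polynomial
```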

Libraries:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.lstsq.html

https://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.polyfit.html

Introductory Tutorial:
https://lagunita.stanford.edu/c4x/HumanitiesScience/StatLearning/asset/linear_regression.pdf

2b. Constrained Linear Regression
Least squares can get confused by outliers, spurious fields, and noise in the data. We therefore need constraints to decrease the variance of the line we fit to a dataset. The right way to do this is to fit a linear regression model with a penalty that ensures the weights do not misbehave. Models can use the L1 norm (LASSO), the L2 norm (ridge regression), or both (elastic net). Mean squared loss is still what is optimized.

Use these algorithms to fit regression lines with constraints, avoiding overfitting and masking noisy dimensions from the model.
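
A hedged sketch on synthetic data: ridge (L2), LASSO (L1), and elastic net fit with scikit-learn; the sparse models should zero out most of the irrelevant weights.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

rng = np.random.RandomState(0)
X = rng.randn(200, 20)
true_w = np.zeros(20)
true_w[:3] = [2.0, -1.5, 0.5]                 # only 3 informative features
y = X @ true_w + 0.1 * rng.randn(200)

for model in (Ridge(alpha=1.0), Lasso(alpha=0.05), ElasticNet(alpha=0.05, l1_ratio=0.5)):
    model.fit(X, y)
    nonzero = np.sum(np.abs(model.coef_) > 1e-3)
    print(type(model).__name__, "nonzero weights:", nonzero)
```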

Libraries:
http://scikit-learn.org/stable/modules/linear_model.html

Introductory Tutorial(s):




3. K means Clustering
Everyone’s favorite unsupervised clustering algorithm. Given a set of data points in the form of vectors, we can form clusters of points based on the distances between them. It is an expectation-maximization-style algorithm that iteratively moves the cluster centers and then assigns points to the nearest center. The inputs to the algorithm are the number of clusters to generate and the number of iterations in which it will try to converge.

As is obvious from the name, you can use this algorithm to create K clusters in a dataset.
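
A minimal sketch, assuming scikit-learn: K-means on toy blobs with K = 3 and a capped number of iterations.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)  # toy 2-D data

kmeans = KMeans(n_clusters=3, max_iter=300, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)         # cluster index assigned to each point
print(kmeans.cluster_centers_)         # the three learned cluster centers
```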

Library:
http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html

Introductory Tutorial(s):


https://www.datascience.com/blog/k-means-clustering

4. Logistic Regression
Logistic regression is constrained linear regression with a nonlinearity (the sigmoid function is used mostly, though you can use tanh too) applied after the weights, which restricts the outputs to lie close to the two classes (1 and 0 in the case of the sigmoid). The cross-entropy loss is optimized using gradient descent. A note to beginners: logistic regression is used for classification, not regression. You can also think of logistic regression as a one-layer neural network. It is trained using optimization methods like gradient descent or L-BFGS. NLP people often use it under the name maximum entropy classifier.

This is what a Sigmoid looks like:

Use LR to train simple, but very robust classifiers.
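
A hedged sketch on a synthetic binary problem: the sigmoid outputs come back as class probabilities from predict_proba.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(solver="lbfgs", C=1.0)   # C controls regularization strength
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
print("class probabilities:", clf.predict_proba(X_test[:3]))  # sigmoid outputs
```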

Library:
http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

Introductory Tutorial(s):


5. SVM (Support Vector Machines)
SVMs are linear models like linear and logistic regression; the difference is that they have a margin-based loss function (the derivation of support vectors is one of the most beautiful mathematical results I have seen, along with eigenvalue calculation). You can optimize the loss function using optimization methods like L-BFGS or even SGD.

Another innovation in SVMs is the use of kernels on the data for feature engineering. If you have good domain insight, you can replace the good old RBF kernel with smarter ones and profit.

One unique thing that SVMs can do is learn one class classifiers.

SVMs can be used to train classifiers (and even regressors).
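
A minimal sketch, assuming scikit-learn: an RBF-kernel SVM classifier plus a one-class SVM for novelty detection on the same toy data.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC, OneClassSVM

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")   # swap kernel= for a smarter kernel
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))

# One-class SVM: learns the support of a single class (the "one class classifier" above).
occ = OneClassSVM(kernel="rbf", nu=0.1).fit(X[y == 0])
print("inlier (+1) / outlier (-1) labels:", occ.predict(X[:5]))
```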

Library:
http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

Introductory Tutorial(s):


Note: SGD-based training of both logistic regression and SVMs is available in scikit-learn's http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html , which I often use as it lets me check both LR and SVM with a common interface. It can also be trained on larger-than-RAM datasets using mini-batches.

6. Feedforward Neural Networks
These are basically multi-layer logistic regression classifiers: many layers of weights separated by nonlinearities (sigmoid, tanh, ReLU + softmax, and the cool new SELU). Another popular name for them is multi-layer perceptrons. FFNNs can be used for classification, and for unsupervised feature learning as autoencoders.

Multi-Layered perceptron
FFNN as an autoencoder
FFNNs can be used to train a classifier or extract features as autoencoders
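
A hedged sketch using scikit-learn's MLPClassifier (one of the libraries listed below): a two-hidden-layer feedforward network on synthetic data.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(64, 32),  # two hidden layers of 64 and 32 units
                    activation="relu",
                    max_iter=500,
                    random_state=0)
mlp.fit(X, y)
print("training accuracy:", mlp.score(X, y))
```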

Libraries:
http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier

http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html

https://github.com/keras-team/keras/blob/master/examples/reuters_mlp_relu_vs_selu.py

Introductory Tutorial(s):
http://www.deeplearningbook.org/contents/mlp.html

http://www.deeplearningbook.org/contents/autoencoders.html

http://www.deeplearningbook.org/contents/representation.html

7. Convolutional Neural Networks (Convnets)
Almost every state-of-the-art vision-based machine learning result in the world today has been achieved using convolutional neural networks. They can be used for image classification, object detection, or even segmentation of images. Invented by Yann LeCun in the late 80s and early 90s, convnets feature convolutional layers that act as hierarchical feature extractors. You can use them on text too (and even on graphs).

Use convnets for state-of-the-art image and text classification, object detection, and image segmentation.
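
A minimal sketch, assuming the Keras API bundled with TensorFlow: a tiny convnet for 28x28 grayscale images with 10 classes (data loading is left out).

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),   # deeper layers learn higher-level features
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),         # 10 output classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, epochs=5)  # x_train shaped (N, 28, 28, 1), y_train integer labels
```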

Libraries:
https://developer.nvidia.com/digits

https://github.com/kuangliu/torchcv

https://github.com/chainer/chainercv

https://keras.io/applications/

Introductory Tutorial(s):
http://cs231n.github.io/

https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks/

8. Recurrent Neural Networks (RNNs):
RNNs model sequences by applying the same set of weights recursively to an aggregator state at time t and the input at time t (given a sequence with inputs at times 0..t..T, and a hidden state at each time t which is the output of step t-1 of the RNN). Pure RNNs are rarely used now, but their counterparts such as LSTMs and GRUs are state of the art in most sequence modeling tasks.

RNN (f here is a densely connected unit plus a nonlinearity; nowadays f is generally an LSTM or GRU). An LSTM unit is used instead of a plain dense layer in a pure RNN.

Use RNNs for any sequence modelling task, especially text classification, machine translation, and language modelling.
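
A hedged sketch, again assuming Keras: an LSTM text classifier over integer-encoded, padded token sequences (the vocabulary size and sequence length are made-up values).

```python
from tensorflow.keras import layers, models

vocab_size, seq_len = 10000, 100   # assumed vocabulary size and padded sequence length

model = models.Sequential([
    layers.Input(shape=(seq_len,)),
    layers.Embedding(input_dim=vocab_size, output_dim=128),
    layers.LSTM(64),                        # swap in layers.GRU(64) if preferred
    layers.Dense(1, activation="sigmoid"),  # binary label, e.g. sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, epochs=3)  # x_train shaped (N, seq_len) with integer token ids
```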

Library:
https://github.com/tensorflow/models (Many cool NLP research papers from Google are here)

https://github.com/wabyking/TextClassificationBenchmark

http://opennmt.net/

9. Conditional Random Fields (CRFs)
CRFs are probably the most frequently used models from the family of probabilistic graphical models (PGMs). They are used for sequence modeling, like RNNs, and can also be used in combination with RNNs. Before neural machine translation systems arrived, CRFs were the state of the art, and on many sequence tagging tasks with small datasets they will still learn better than RNNs, which require a larger amount of data to generalize. They can also be used in other structured prediction tasks such as image segmentation. A CRF models each element of a sequence (say, a sentence) so that neighbors affect the label of a component in the sequence, instead of all labels being independent of each other.

Use CRFs to tag sequences (in text, images, time series, DNA, etc.)
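
A toy sketch, assuming the sklearn-crfsuite package linked below: each sentence is a list of per-token feature dicts with a parallel list of tags.

```python
import sklearn_crfsuite

# One tiny hand-made training sentence: features per token, and the tag per token.
X_train = [[
    {"word.lower()": "john",  "is_title": True},
    {"word.lower()": "lives", "is_title": False},
    {"word.lower()": "in",    "is_title": False},
    {"word.lower()": "dhaka", "is_title": True},
]]
y_train = [["B-PER", "O", "O", "B-LOC"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X_train, y_train)
print(crf.predict(X_train))   # predicted tag sequence for each sentence
```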

Library:
https://sklearn-crfsuite.readthedocs.io/en/latest/

Introductory Tutorial(s):
http://blog.echen.me/2012/01/03/introduction-to-conditional-random-fields/

7 part lecture series by Hugo Larochelle on Youtube:


10. Decision Trees
Let’s say I am given an Excel sheet with data about various fruits and I have to tell which ones look like apples. What I will do is ask the question “Which fruits are red and round?” and divide the fruits into those that answer yes and those that answer no. Now, not all red and round fruits are apples, and not all apples are red and round. So I will ask “Which fruits have red or yellow color hints on them?” of the red and round fruits, and “Which fruits are green and round?” of the not-red-and-round fruits. Based on these questions I can tell with considerable accuracy which are apples. This cascade of questions is what a decision tree is. However, this is a decision tree based on my intuition, and intuition cannot work on high-dimensional, complex data. We have to come up with the cascade of questions automatically by looking at labeled data; that is what machine-learning-based decision trees do. Earlier versions like CART trees were once used for simple data, but with bigger and larger datasets the bias-variance tradeoff needs to be solved with better algorithms. The two common decision tree algorithms used nowadays are random forests (which build different classifiers on random subsets of attributes and combine them for the output) and boosting trees (which train a cascade of trees, each on top of the others, correcting the mistakes of the ones below it).

Decision trees can be used to classify data points (and even for regression).
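
A hedged sketch on synthetic data: a random forest and a gradient-boosted ensemble from scikit-learn side by side (XGBoost's XGBClassifier exposes a similar fit/predict interface).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Random forest: many trees on random subsets of features, their votes combined.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Boosting: trees trained one after another, each correcting the previous ones' mistakes.
gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=0).fit(X, y)

print("random forest training accuracy:", rf.score(X, y))
print("gradient boosting training accuracy:", gb.score(X, y))
```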

Libraries
http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html

http://xgboost.readthedocs.io/en/latest/

https://catboost.yandex/

Introductory Tutorial:
http://xgboost.readthedocs.io/en/latest/model.html

https://arxiv.org/abs/1511.05741

https://arxiv.org/abs/1407.7502

http://education.parrotprediction.teachable.com/p/practical-xgboost-in-python

TD Algorithms (Good To Have)
If you are still wondering how any of the above methods could solve a task like defeating a Go world champion, as DeepMind did, they cannot. All ten types of algorithms we talked about above do pattern recognition; they are not strategy learners. To learn a strategy for solving a multi-step problem, like winning a game of chess or playing an Atari console, we need to let an agent loose in the world and let it learn from the rewards and penalties it faces. This type of machine learning is called reinforcement learning. A lot (though not all) of the recent successes in the field are the result of combining the perception abilities of a convnet or an LSTM with a set of algorithms called temporal difference learning. These include Q-learning, SARSA, and some other variants. These algorithms are a clever play on Bellman's equations to obtain a loss function that can be trained with the rewards an agent gets from the environment.

These algorithms are mostly used to play games automatically :D, but they also have applications in language generation and object detection.
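
As a hedged, self-contained illustration of the Bellman-style update (not tied to any of the libraries below): tabular Q-learning on a made-up five-state corridor where the agent is rewarded only for reaching the rightmost state.

```python
import random

n_states, n_actions = 5, 2             # actions: 0 = move left, 1 = move right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate

def step(state, action):
    next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    done = next_state == n_states - 1
    return next_state, reward, done

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection.
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: Q[state][a])
        next_state, reward, done = step(state, action)
        # Q-learning update: a direct play on the Bellman equation.
        target = reward + gamma * max(Q[next_state])
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state

print(Q)   # after training, action 1 (right) should have the higher value in every state
```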

Libraries:
https://github.com/keras-rl/keras-rl

https://github.com/tensorflow/minigo

Introductory Tutorial(s):
Grab the free Sutton and Barto book: https://web2.qatar.cmu.edu/~gdicaro/15381/additional/SuttonBarto-RL-5Nov17.pdf

Watch David Silver course:


These are 10 machine learning algorithm families you can learn on your way to becoming a data scientist.

source: towardsdatascience.com

34
When we’re shown an image, our brain instantly recognizes the objects contained in it. On the other hand, it takes a lot of time and training data for a machine to identify these objects. But with the recent advances in hardware and deep learning, this computer vision field has become a whole lot easier and more intuitive.

Check out the below image as an example. The system is able to identify different objects in the image with incredible accuracy.



Object detection technology has seen a rapid adoption rate in various and diverse industries. It helps self-driving cars navigate safely through traffic, spots violent behavior in crowded places, helps sports teams analyze and build scouting reports, and ensures proper quality control of parts in manufacturing, among many, many other things. And this is just scratching the surface of what object detection technology can do!

In this article, we will understand what object detection is and look at a few different approaches one can take to solve problems in this space. Then we will deep dive into building our own object detection system in Python. By the end of the article, you will have enough knowledge to take on different object detection challenges on your own!

Note: This tutorial assumes that you know the basics of deep learning and have solved simple image processing problems before. In case you haven’t, or need a refresher, I recommend reading the following articles first:

Fundamentals of Deep Learning – Starting with Artificial Neural Network
Deep Learning for Computer Vision – Introduction to Convolution Neural Networks
Tutorial: Optimizing Neural Networks using Keras (with Image recognition case study)

For more : https://www.analyticsvidhya.com/blog/2018/06/understanding-building-object-detection-model-python/

35
The majority of the deep learning applications that we see in the community are usually geared towards fields like marketing, sales, finance, etc. We hardly ever read articles or find resources about deep learning being used to protect these products, and the business, from malware and hacker attacks.

While the big technology companies like Google, Facebook, Microsoft, and Salesforce have already embedded deep learning into their products, the cybersecurity industry is still playing catch up. It’s a challenging field but one that needs our full attention.

For more visit : https://www.analyticsvidhya.com/blog/2018/07/using-power-deep-learning-cyber-security/

36
Data science is an ever-growing field, and there are numerous tools and techniques to remember. It is not possible for anyone to remember all the functions, operations and formulas of each concept. That's why we have cheat sheets. But there is a plethora of cheat sheets available out there, and choosing the right one is a tough task. So I decided to write this article.

Here I have selected the cheat sheets on the following criteria: comprehensiveness, clarity, and content.

After applying these filters, I have collated some 28 cheat sheets on machine learning, data science, probability, SQL and big data. For your convenience, I have grouped the cheat sheets separately for each of the topics above. There are cheat sheets on tools and techniques as well as on various libraries and languages.

For more visit : https://www.analyticsvidhya.com/blog/2017/02/top-28-cheat-sheets-for-machine-learning-data-science-probability-sql-big-data/

37
Software Engineering / What is Design Pattern and why it's needed?
« on: July 05, 2018, 06:06:31 PM »
To start with, Design Pattern is not a topic of any specific programming language. It presents a set of standard software designing concepts. If a complex system is designed following the right design pattern, it becomes easier to develop, delivers good performance, and, most importantly, the system becomes easier to maintain and extend. Put simply, a design pattern is the architecture of a software system.
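
As a tiny illustration (my own example, not taken from the linked blog), here is the classic Strategy pattern sketched in Python: new payment behaviours can be added without touching the cart code.

```python
from abc import ABC, abstractmethod

class PaymentStrategy(ABC):
    """Interface every concrete payment strategy must implement."""
    @abstractmethod
    def pay(self, amount: float) -> str: ...

class CardPayment(PaymentStrategy):
    def pay(self, amount: float) -> str:
        return f"Paid {amount} by card"

class MobileBankingPayment(PaymentStrategy):
    def pay(self, amount: float) -> str:
        return f"Paid {amount} by mobile banking"

class ShoppingCart:
    def __init__(self, strategy: PaymentStrategy):
        self.strategy = strategy          # the behaviour is injected, not hard-coded

    def checkout(self, amount: float) -> str:
        return self.strategy.pay(amount)

print(ShoppingCart(CardPayment()).checkout(250.0))
print(ShoppingCart(MobileBankingPayment()).checkout(99.0))
```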


To read the full blog visit: https://hnjaman.blogspot.com/2018/06/design-patterns.html

38
A federal investigation into Facebook's sharing of data with political consultancy Cambridge Analytica has broadened to focus on the actions and statements of the tech giant and now involves three agencies, including the Securities and Exchange Commission, according to people familiar with the official inquiries.

Representatives for the FBI, the SEC and the Federal Trade Commission have joined the Department of Justice in its inquiries about the two companies and the sharing of personal information of 71 million Americans, suggesting the wide-ranging nature of the investigation, said five people, who spoke on the condition of anonymity to discuss a probe that remains incomplete.

Source: https://gadgets.ndtv.com/social-networking/news/facebook-cambridge-analytica-data-sharing-probe-said-to-now-include-fbi-sec-ftc-1876955

40
Software Engineering / Private browsing gets more private
« on: June 11, 2018, 01:44:26 PM »
New system patches security holes left open by web browsers’ private-browsing functions.

Today, most web browsers have private-browsing modes, in which they temporarily desist from recording the user’s browsing history.

But data accessed during private browsing sessions can still end up tucked away in a computer’s memory, where a sufficiently motivated attacker could retrieve it.

This week, at the Network and Distributed Systems Security Symposium, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Harvard University presented a paper describing a new system, dubbed Veil, that makes private browsing more private.

Veil would provide added protections to people using shared computers in offices, hotel business centers, or university computing centers, and it can be used in conjunction with existing private-browsing systems and with anonymity networks such as Tor, which was designed to protect the identity of web users living under repressive regimes.

“Veil was motivated by all this research that was done previously in the security community that said, ‘Private-browsing modes are leaky — Here are 10 different ways that they leak,’” says Frank Wang, an MIT graduate student in electrical engineering and computer science and first author on the paper. “We asked, ‘What is the fundamental problem?’ And the fundamental problem is that [the browser] collects this information, and then the browser does its best effort to fix it. But at the end of the day, no matter what the browser’s best effort is, it still collects it. We might as well not collect that information in the first place.”

Wang is joined on the paper by his two thesis advisors: Nickolai Zeldovich, an associate professor of electrical engineering and computer science at MIT, and James Mickens, an associate professor of computer science at Harvard.

Shell game

With existing private-browsing sessions, Wang explains, a browser will retrieve data much as it always does and load it into memory. When the session is over, it attempts to erase whatever it retrieved.

But in today’s computers, memory management is a complex process, with data continuously moving around between different cores (processing units) and caches (local, high-speed memory banks). When memory banks fill up, the operating system might transfer data to the computer’s hard drive, where it could remain for days, even after it’s no longer being used.

Generally, a browser won’t know where the data it downloaded has ended up. Even if it did, it wouldn’t necessarily have authorization from the operating system to delete it.

Veil gets around this problem by ensuring that any data the browser loads into memory remains encrypted until it’s actually displayed on-screen. Rather than typing a URL into the browser’s address bar, the Veil user goes to the Veil website and enters the URL there. A special server — which the researchers call a blinding server — transmits a version of the requested page that’s been translated into the Veil format.

The Veil page looks like an ordinary webpage: Any browser can load it. But embedded in the page is a bit of code — much like the embedded code that would, say, run a video or display a list of recent headlines in an ordinary page — that executes a decryption algorithm. The data associated with the page is unintelligible until it passes through that algorithm.

Decoys

Once the data is decrypted, it will need to be loaded in memory for as long as it’s displayed on-screen. That type of temporarily stored data is less likely to be traceable after the browser session is over. But to further confound would-be attackers, Veil includes a few other security features.

One is that the blinding servers randomly add a bunch of meaningless code to every page they serve. That code doesn’t affect the way a page looks to the user, but it drastically changes the appearance of the underlying source file. No two transmissions of a page served by a blinding server look alike, and an adversary who managed to recover a few stray snippets of decrypted code after a Veil session probably wouldn’t be able to determine what page the user had visited.

If the combination of run-time decryption and code obfuscation doesn’t give the user an adequate sense of security, Veil offers an even harder-to-hack option. With this option, the blinding server opens the requested page itself and takes a picture of it. Only the picture is sent to the Veil user, so no executable code ever ends up in the user’s computer. If the user clicks on some part of the image, the browser records the location of the click and sends it to the blinding server, which processes it and returns an image of the updated page.

The back end

Veil does, of course, require web developers to create Veil versions of their sites. But Wang and his colleagues have designed a compiler that performs this conversion automatically. The prototype of the compiler even uploads the converted site to a blinding server. The developer simply feeds the existing content for his or her site to the compiler.

A slightly more demanding requirement is the maintenance of the blinding servers. These could be hosted by either a network of private volunteers or a for-profit company. But site managers may wish to host Veil-enabled versions of their sites themselves. For web services that already emphasize the privacy protections they afford their customers, the added protections provided by Veil could offer a competitive advantage.

“Veil attempts to provide a private browsing mode without relying on browsers,” says Taesoo Kim, an assistant professor of computer science at Georgia Tech, who was not involved in the research. “Even if end users didn't explicitly enable the private browsing mode, they still can get benefits from Veil-enabled websites. Veil aims to be practical — it doesn't require any modification on the browser side — and to be stronger — taking care of other corner cases that browsers do not have full control of.”


Source : MIT news

41
In data-science parlance, graphs are structures of nodes and connecting lines that are used to map scores of complex data relationships. Analyzing graphs is useful for a broad range of applications, such as ranking webpages, analyzing social networks for political insights, or plotting neuron structures in the brain.

Consisting of billions of nodes and lines, however, large graphs can reach terabytes in size. The graph data are typically processed in expensive dynamic random access memory (DRAM) across multiple power-hungry servers.

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have now designed a device that uses cheap flash storage — the type used in smartphones — to process massive graphs using only a single personal computer.

Flash storage is typically far slower than DRAM at processing graph data. But the researchers developed a device consisting of a flash chip array and a computation “accelerator” that helps flash achieve DRAM-like performance.

Powering the device is a novel algorithm that sorts all access requests for graph data into a sequential order that flash can access quickly and easily. It also merges some requests to reduce the overhead — the combined computation time, memory, bandwidth, and other computing resources — of sorting.

The researchers ran the device against several traditional high-performance systems processing several large graphs, including the massive Web Data Commons Hyperlink Graph, which has 3.5 billion nodes and 128 billion connecting lines. To process that graph, the traditional systems all required a server that cost thousands of dollars and contained 128 gigabytes of DRAM. The researchers achieved the same performance by plugging two of their devices — totaling 1 gigabyte of DRAM and 1 terabyte of flash — into a desktop computer. Moreover, by combining several devices, they could process massive graphs — up to 4 billion nodes and 128 billion connecting lines — that no other system could handle on the 128-gigabyte server.

“The bottom line is that we can maintain performance with much smaller, fewer, and cooler — as in temperature and power consumption — machines,” says Sang-Woo Jun, a CSAIL graduate student and first author on a paper describing the device, which is being presented at the International Symposium on Computer Architecture (ISCA).

The device could be used to cut costs and energy associated with graph analytics, and even improve performance, in a broad range of applications. The researchers, for instance, are currently creating a program that could identify genes that cause cancers. Major tech companies such as Google could also leverage the devices to reduce their energy footprint by using far fewer machines to run analytics.

“Graph processing is such a general idea,” says co-author Arvind, the Johnson Professor in Computer Science Engineering. “What does page ranking have in common with gene detection? For us, it’s the same computation problem — just different graphs with different meanings. The type of application someone develops will determine the impact it has on society.”

Paper co-authors are CSAIL graduate student Shuotao Xu, and Andy Wright and Sizhuo Zhang, two graduate students in CSAIL and the Department of Electrical Engineering and Computer Science.

Sort and reduce

In graph analytics, a system will basically search for and update a node’s value based on its connections with other nodes, among other metrics. In webpage ranking, for instance, each node represents a webpage. If node A has a high value and connects to node B, then node B’s value will also increase.

Traditional systems store all graph data in DRAM, which makes them fast at processing the data but also expensive and power-hungry. Some systems offload some data storage to flash, which is cheaper but slower and less efficient, so they still require substantial amounts of DRAM.

The researchers’ device runs on what the researchers call a “sort-reduce” algorithm, which solves a major issue with using flash as the primary storage source: waste.

Graph analytics systems require access to nodes that may be very far from one another across a massive, sparse graph structure. Systems generally request direct access to, say, 4 to 8 bytes of data to update a node’s value. DRAM provides that direct access very quickly. Flash, however, only accesses data in 4- to 8-kilobyte chunks, but still only updates a few bytes. Repeating that access for every request while jumping across the graph wastes bandwidth. “If you need to access the entire 8 kilobytes, and use only 8 bytes and toss the rest, you end up throwing 1,000 times performance away,” Jun says.

The sort-reduce algorithm instead takes all direct access requests and sorts them in sequential order by identifiers, which show the destination of the request — such as grouping together all updates for node A, all for node B, and so on. Flash can then access kilobyte-sized chunks of thousands of requests at once, making it far more efficient.

To further save computation power and bandwidth, the algorithm simultaneously merges the data into the smallest groupings possible. Whenever the algorithm notes matching identifiers, it sums those into a single data packet — such as A1 and A2 becoming A3. It continues doing so, creating increasingly smaller packets of data with matching identifiers, until it produces the smallest possible packet to sort. This drastically reduces the amount of duplicate requests to access.
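
As a rough illustration of the idea only (a toy in Python, not the researchers' FPGA implementation), sorting update requests by destination node and merging duplicates turns many small random accesses into one sequential pass:

```python
from collections import defaultdict

# Each request is (node_id, delta): "add delta to this node's value".
requests = [("B", 2), ("A", 1), ("C", 5), ("A", 3), ("B", 4)]

merged = defaultdict(int)
for node, delta in requests:
    merged[node] += delta              # "A1 and A2 become A3": duplicates are combined

# One sorted, sequential pass over storage instead of many tiny random accesses.
for node in sorted(merged):
    print(f"write node {node}: apply combined update {merged[node]}")
```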

Using the sort-reduce algorithm on two large graphs, the researchers reduced the total data that needed to be updated in flash by about 90 percent.

Offloading computation

The sort-reduce algorithm is computation-intensive for a host computer, however, so the researchers implemented a custom accelerator in the device. The accelerator acts as a midway point between the host and flash chips, executing all computation for the algorithm. This offloads so much power to the accelerator that the host can be a low-powered PC or laptop that manages sorted data and executes other minor tasks.

“Accelerators are supposed to help the host compute, but we’ve come so far [with the computations] that the host becomes unimportant,” Arvind says.

“The MIT work shows a new way to perform analytics on very large graphs: Their work exploits flash memory to store the graphs and exploits ‘field-programmable gate arrays’ [custom integrated circuits] in an ingenious way to perform both the analytics and the data processing required to use flash memory effectively,” says Keshav Pingali, a professor of computer science at the University of Texas at Austin. “In the long run, this may lead to systems that can process large amounts of data efficiently on laptops or desktops, which will revolutionize how we do big-data processing.”

Because the host can be so low-powered, Jun says, a long-term goal is to create a general-purpose platform and software library for consumers to develop their own algorithms for applications beyond graph analytics. “You could plug this platform into a laptop, download [the software], and write simple programs to get server-class performance on your laptop,” he says.

Source : MIT news

42
As computers enter ever more areas of our daily lives, the amount of data they produce has grown enormously.

But for this “big data” to be useful it must first be analyzed, meaning it needs to be stored in such a way that it can be accessed quickly when required.

Previously, any data that needed to be accessed in a hurry would be stored in a computer’s main memory, or dynamic random access memory (DRAM) — but the size of the datasets now being produced makes this impossible.

So instead, information tends to be stored on multiple hard disks on a number of machines across an Ethernet network. However, this storage architecture considerably increases the time it takes to access the information, according to Sang-Woo Jun, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT.

“Storing data over a network is slow because there is a significant additional time delay in managing data access across multiple machines in both software and hardware,” Jun says. “And if the data does not fit in DRAM, you have to go to secondary storage — hard disks, possibly connected over a network — which is very slow indeed.”

Now Jun, fellow CSAIL graduate student Ming Liu, and Arvind, the Charles W. and Jennifer C. Johnson Professor of Electrical Engineering and Computer Science, have developed a storage system for big-data analytics that can dramatically speed up the time it takes to access information.

The system, which will be presented in February at the International Symposium on Field-Programmable Gate Arrays in Monterey, Calif., is based on a network of flash storage devices.

Flash storage systems perform better at tasks that involve finding random pieces of information from within a large dataset than other technologies. They can typically be randomly accessed in microseconds. This compares to the data “seek time” of hard disks, which is typically four to 12 milliseconds when accessing data from unpredictable locations on demand.

Flash systems also are nonvolatile, meaning they do not lose any of the information they hold if the computer is switched off.

In the storage system, known as BlueDBM — or Blue Database Machine — each flash device is connected to a field-programmable gate array (FPGA) chip to create an individual node. The FPGAs are used not only to control the flash device, but are also capable of performing processing operations on the data itself, Jun says.

“This means we can do some processing close to where the data is [being stored], so we don’t always have to move all of the data to the machine to work on it,” he says.

What’s more, FPGA chips can be linked together using a high-performance serial network, which has a very low latency, or time delay, meaning information from any of the nodes can be accessed within a few nanoseconds. “So if we connect all of our machines using this network, it means any node can access data from any other node with very little performance degradation, [and] it will feel as if the remote data were sitting here locally,” Jun says.

Using multiple nodes allows the team to get the same bandwidth and performance from their storage network as far more expensive machines, he adds.

The team has already built a four-node prototype network. However, this was built using 5-year-old parts, and as a result is quite slow.

So they are now building a much faster 16-node prototype network, in which each node will operate at 3 gigabytes per second. The network will have a capacity of 16 to 32 terabytes.

Using the new hardware, Liu is also building a database system designed for use in big-data analytics. The system will use the FPGA chips to perform computation on the data as it is accessed by the host computer, to speed up the process of analyzing the information, Liu says.

“If we’re fast enough, if we add the right number of nodes to give us enough bandwidth, we can analyze high-volume scientific data at around 30 frames per second, allowing us to answer user queries at very low latencies, making the system seem real-time,” he says. “That would give us an interactive database.”

As an example of the type of information the system could be used on, the team has been working with data from a simulation of the universe generated by researchers at the University of Washington. The simulation contains data on all the particles in the universe, across different points in time.

“Scientists need to query this rather enormous dataset to track which particles are interacting with which other particles, but running those kind of queries is time-consuming,” Jun says. “We hope to provide a real-time interface that scientists can use to look at the information more easily.”

Kees Vissers of programmable chip manufacturer Xilinx, based in San Jose, Calif., says flash storage is beginning to be seen as a replacement for both DRAM and hard disks. “Historically, computer architecture had to have a particular memory hierarchy — cache on the processors, DRAM off-chip, and then hard disks — but that whole line is now being blurred by novel mechanisms of flash technology.”

The work at MIT is particularly interesting because the team has optimized the whole system to work with flash, including the development of novel hardware interfaces, Vissers says. “This means you get a system-level benefit,” he says.


Source : MIT News

43
Software Engineering / Teaching chores to an artificial agent
« on: June 11, 2018, 01:40:13 PM »
For many people, household chores are a dreaded, inescapable part of life that we often put off or do with little care. But what if a robot assistant could help lighten the load?

Recently, computer scientists have been working on teaching machines to do a wider range of tasks around the house. In a new paper spearheaded by MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the University of Toronto, researchers demonstrate “VirtualHome,” a system that can simulate detailed household tasks and then have artificial “agents” execute them, opening up the possibility of one day teaching robots to do such tasks.

For more info visit: http://news.mit.edu/2018/mit-csail-teaching-chores-artificial-agent-0530

44
Software Engineering / Tomorrow's Cities...
« on: June 11, 2018, 01:35:21 PM »
It is a terrifying vision of the future - a robot police officer with dark eyes and no discernible mouth that can identify criminals and collect evidence.

The robocop, complete with police hat to give it that eerie uncanny valley feel, was shown off outside the world's tallest tower, Burj Khalifa, in Dubai, last June.

But since then what has it done? And is Dubai's love affair with robotics any more than just PR for a country desperate to be at the cutting edge of technology?


For more visit: https://www.bbc.com/news/technology-41268996

45
At the Minnesota Street Project, a gallery in San Francisco's hip and fast-developing Dogpatch district, Adobe Systems is trying to kick-start AR as a new medium for artists. Exhibits placed swirling dragons and fluttering butterflies into empty rooms, showed digitized people on miniature real-world accompanied by their animated dreams, and translated a singing opera singer into a wall-size face blasting glowing particles into space.

Augmented reality, which overlays digital imagery onto our view of the real world, could be either a techno-curiosity or the next big thing. Adobe, which announced a new AR creation tool called Aero at Apple's Worldwide Developers Conference this week, is in the next big thing camp.

for more visit : https://www.cnet.com/news/calling-all-artists-time-to-try-ar-adobe-says/

Pages: 1 2 [3] 4