Big Data Implications: Transforming Society and Industries, Lecture notes for Elements of Computer Science

Statistics, Data Science, Computer Science, Machine Learning, Artificial Intelligence

This document covers the far-reaching implications of big data, which go beyond marketing and consumer goods to profoundly affect governments, politics, and daily life. Forbes identifies five areas in which big data will influence us: spending, voting, studying, health, and privacy. The document also discusses the challenges and opportunities of handling large volumes of data, such as the shift from exact to approximate, and the rise of machine learning and AI. IBM Watson and AlphaGo are highlighted as examples of intelligent systems derived from big data.

What you will learn

  • How is Big Data changing the nature of politics and our daily life?
  • How does Big Data help computers pass the Turing test?
  • What are the ways Big Data will influence us according to Forbes?
  • What are the technological implications of the shift from exact to approximate data?
  • What are the issues with the concept n = All in Big Data?

Type: Lecture notes

2018/2019

Uploaded on 24/02/2019

gessica_giangrasso

Partial preview of the text

Big Data Implications

Giovanni Giuffrida

December 5, 2018

The Big Data based new society

New society: Flying objects

New society: Self-driving cars

It’s happening!

Big Data is surely helping computers pass the Turing test!

Big Data revolution implies big shifts

  • From exact to approximate
  • From sampling to all (n = All)
  • From causality to correlation

From exact to approximate

  • Increasing data size and speed leads to “inexactitude”
  • Data in databases are never “clean”; with small data you can afford to clean them
  • Good enough is “good enough” with big data
  • We are willing to sacrifice a little accuracy in favor of general trends
  • Big data turns figures into probabilities, and this is fine in many fields

n = All

  • No more sampling!
  • For the past hundred years statistics has been based on sampling: find the “best”, “smallest”, “most representative” sample
  • This was accepted as a fact of life
  • Reality: the technology was too poor to process ALL data
  • Some industries developed around this concept, e.g. surveys

n = All... some issues there

  • Sampling works well at the macro level
  • Like a picture: good from the right distance, blurry up close
  • In a sense, the sample is chosen depending on the “distance” from which you look at it
  • You may need to reprocess data with new samples in order to change “distance”
  • Sampling may not work well for outlier detection

From causality to correlation

  • Big data is about what, not why
  • Stop searching for “causality”
  • Correlation doesn’t tell us why something is happening
  • If millions of cancer patients who drink orange juice and take one aspirin a day get better... do we really care why?
  • Prediction based on correlations is central in Big Data
  • It leads to big (BIG) social implications

Mexican lemon imports prevent highway deaths.
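The lemon headline above is a deliberately absurd reading of a strong correlation. A minimal sketch of how such a correlation can arise is given here, under loud assumptions: the data are synthetic (not the NHTSA or USDA figures), and the function names `pearson`, `make_trending_series`, and `diffs` are hypothetical helpers written for this illustration. Two series that share nothing but a steady trend over time correlate strongly in their raw levels, while their period-to-period changes are close to uncorrelated.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_trending_series(n, seed=0):
    """Build two synthetic, causally unrelated series that both trend over time.

    Assumption: the values are invented for illustration only.
    `lemons` drifts upward, `deaths` drifts downward, each with independent noise.
    """
    rng = random.Random(seed)
    lemons = [200 + 350 * t / n + rng.gauss(0, 15) for t in range(n)]
    deaths = [16 - 2 * t / n + rng.gauss(0, 0.1) for t in range(n)]
    return lemons, deaths


def diffs(xs):
    """Period-to-period changes, which remove most of a steady trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


if __name__ == "__main__":
    lemons, deaths = make_trending_series(240, seed=0)
    # Raw levels: strongly negative correlation, driven by the shared time trend.
    print("levels:", round(pearson(lemons, deaths), 3))
    # Changes: the trend is gone, so the correlation is expected to be near zero.
    print("changes:", round(pearson(diffs(lemons), diffs(deaths)), 3))
```

Differencing is only one simple way to remove a trend. It is used here because it needs no extra libraries, not because the slides prescribe it.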
Figure: total US highway fatality rate plotted against fresh lemons imported to the USA from Mexico (metric tons), R² = 0.97. Sources: U.S. NHTSA, DOT HS 810 780; U.S. Department of Agriculture.

Eating organic food causes autism.

Figure: “The real cause of increasing autism prevalence?” Autism (individuals diagnosed) and organic food sales by year, r = 0.9971 (p < 0.0001).

Using Internet Explorer leads to murder.

Figure: “Internet Explorer vs Murder Rate”. Murders in the US and Internet Explorer market share, 2006 to 2011.

CYC vs Watson

  • Two (very) different approaches
  • CYC was “embedding” knowledge
  • Watson is able to “learn” from huge amounts of data

Cyc is an artificial intelligence project that attempts to assemble a comprehensive ontology and knowledge base of everyday common sense knowledge, with the goal of enabling AI applications to perform human-like reasoning. The project was started in 1984 by Douglas Lenat at MCC and is developed by the Cycorp company. Parts of the project are released as OpenCyc, which provides an API, RDF endpoint, and data dump under an open source license.

Original author: Douglas Lenat. Developer: Cycorp, Inc. Initial release: 1984. Stable release: 2012. Written in: Lisp. Type: ontology and inference engine.

The IBM Jeopardy Challenge represents a milestone in the development of artificial intelligence, and is part of Big Blue's centennial celebration. “We are at a very special moment in time,” said Dr John E. Kelly III, IBM Senior Vice President and Director of IBM Research.
“We are at a moment where computers and computer technology now have approached humans. We have created a computer system that has the ability to understand natural human language, which is a very difficult thing for computers to do.”

Named after IBM founder Thomas J. Watson, the supercomputer is one of the most advanced systems on Earth and was programmed by 25 IBM scientists over the last four years. Researchers scanned some 200 million pages of content, or the equivalent of about one million books, into the system, including books, movie scripts and entire encyclopedias.

How It Works

IBM Watson Health is improving health by bringing the world's data to our daily lives.

IBM’s Watson, the language-fluent computer that beat the best human champions at a game of the US TV show Jeopardy!, is being turned into a tool for medical diagnosis. Its ability to absorb and analyse vast quantities of data is, IBM claims, better than that of human doctors, and its deployment through the cloud could also reduce healthcare costs.

AlphaGo

  • A 3,000-year-old game
  • Simple board
  • Before 2016 it was considered “impossible” to model
  • Many (many) more combinations than chess
  • “the most elegant game that humans have ever invented”; “simple rules that give rise to endless complexity”; “more possible Go positions than there are atoms in the universe”
  • Mostly based on “intuition”

The sharing economy

Linking people who have surplus goods with people who can make use of them

  • Natural way to optimize the overall stock of cars, bedrooms, etc.
  • Reduces the need to produce more
  • Wealth redistribution
  • Transactions made directly between provider and consumer

  • Policymakers: how to regulate it?
  • How to collect tax
  • How to guarantee public safety
  • How to protect old-style worker categories
  • No clear winning strategy yet

Who owns the Big Data?
  • Rio is the first city to collect real-time data from
      • Waze (drivers)
      • Moovit (public transportation)
      • Strava (cycling)
  • Daily aggregated view of 110,000 drivers
  • 60,000 incidents reported each day
  • Huge cost saving: from cameras and sensors to smartphones
  • Better traffic alerting for citizens
  • Data exchange between Rio and Waze/Moovit
  • Buying data from Strava

Significant social and political implications

Big Data are crucial for more and more public sectors

  • “Silicon Valley” owns the largest portion of worldwide Big Data
  • “Silicon Valley” knows a lot about our health
  • “Silicon Valley” knows a lot about urban transportation
  • “Silicon Valley” knows a lot about our education
  • “Silicon Valley” knows a lot about travellers and housing

If “Silicon Valley” could provide for basic needs (from health to education to public transportation), why do we maintain our fat government, and why do we need to pay taxes?