Big Data Analytics

Introduction
5 Vs of Big Data: Volume, Velocity, Variety, Veracity, Value.

Big Data Infrastructure (Slide P38)
Reliable distributed file system: data is kept in “chunks” spread across machines. Each chunk is replicated on different machines, so recovery from a machine or disk failure is seamless.
Bring computation directly to the data: chunk servers also serve as compute servers.

Societal concerns: privacy, algorithmic black boxes, filter bubbles.

MapReduce
Programming model: MapReduce is both a programming model and a parallel, data-aware, fault-tolerant implementation of that model.
Mappers (Slide P11): The Map function takes an input element as its argument and produces zero or more key-value pairs. The types of keys and values are arbitrary. Further, keys are not “keys” in the usual sense: they do not have to be unique. A Map task can produce several key-value pairs with the same key, even from the same element.
Reducers (Slide P11): The Reduce function’s argument is a p
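To make the Map description above concrete, here is a minimal single-machine word-count sketch. It is not taken from the slides: the class name WordCountSketch and the methods map, reduce, and run are hypothetical, and the grouping step that collects all values for a key is simulated with an in-memory TreeMap rather than a distributed shuffle.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration of the model; not a distributed implementation.
class WordCountSketch {

    // Map: one input element (a line of text) yields zero or more (word, 1) pairs.
    // The same key can be emitted several times from a single element.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word.toLowerCase(), 1));
            }
        }
        return pairs;
    }

    // Reduce: one key plus all values emitted for that key yields a combined value.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: runs map on every line, groups values by key, then reduces each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }
}
```

Calling WordCountSketch.run on the lines "big data big" and "Data" groups the pairs under the keys "big" and "data" and sums each group to 2. In a real deployment the grouping would happen across machines, with the map work scheduled on the chunk servers that hold the data.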
[Interview] URLify:
--------------------------------------------------------------------------------
Question: URLify: Write a method to replace all spaces in a string with '%20'. You may assume that the string has sufficient space at the end to hold the additional characters.
Example input: 'mr john smith    ' output: 'mr%20john%20smith'
--------------------------------------------------------------------------------
Idea 1: Start from the back and skip the trailing padding spaces until the first non-space character. From there, walk the remaining characters in reverse order and replace each space with '%20'.
Solution 1:
public class Solution{
    public String replace(char[] str) {
        boolean flag = false;
        StringBuffer sb = new StringBuffer();
        for (int i = str.length - 1; i >= 0; i--) {
            if (str[i] != ' ') flag = true;
            if (flag == true) {
                if (str[i] == ' ') {
                    s
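
Since the solution above is cut off, here is a separate, self-contained sketch of the same back-to-front idea. It is a different variant from Solution 1: it writes into the char array in place instead of building a StringBuffer. The class name UrlifySketch, the trueLength parameter, and the String convenience overload are assumptions made for illustration, not part of the original solution.

```java
// Hypothetical in-place variant of Idea 1; names and signatures are illustrative.
class UrlifySketch {

    // Replaces each space among the first trueLength characters with "%20", in place.
    // Assumes str has enough spare room at the end; returns the new logical length.
    static int urlify(char[] str, int trueLength) {
        int spaces = 0;
        for (int i = 0; i < trueLength; i++) {
            if (str[i] == ' ') {
                spaces++;
            }
        }
        int newLength = trueLength + 2 * spaces;
        if (newLength > str.length) {
            throw new IllegalArgumentException("buffer too small for the expanded string");
        }
        // Walk backwards so no character is overwritten before it has been read.
        int write = newLength;
        for (int read = trueLength - 1; read >= 0; read--) {
            if (str[read] == ' ') {
                str[--write] = '0';
                str[--write] = '2';
                str[--write] = '%';
            } else {
                str[--write] = str[read];
            }
        }
        return newLength;
    }

    // Convenience overload: treats trailing spaces as padding, as Idea 1 does.
    static String urlify(String padded) {
        char[] str = padded.toCharArray();
        int trueLength = str.length;
        while (trueLength > 0 && str[trueLength - 1] == ' ') {
            trueLength--;
        }
        int newLength = urlify(str, trueLength);
        return new String(str, 0, newLength);
    }
}
```

With the padded input 'mr john smith' followed by four spaces, UrlifySketch.urlify returns 'mr%20john%20smith'. Writing from the back is what makes the in-place version safe: the write index never falls behind the read index, so unread characters are never clobbered.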