“Every problem becomes childish when once it is explained to you”

Monday, August 21, 2017

Using Linear Regression for Stock Market Model

The previous post gave an overview of machine learning and linear regression. Here we apply it to a very basic stock forecasting use case.

Use case: In the stock market, some stocks tend to move linearly with the valuation of the market as a whole.
In other words, such a stock is sensitive to market changes. We want to predict the stock's price from the market's valuation.


Approach:
We can use simple linear regression here, since we assume the relationship is linear.

Prediction equation: y = mx + b


Let X be the index points of the stock exchange, and let Y be the price of a single stock listed on that exchange. Y is the value we want to predict.


Use http://www.r-fiddle.org/ to write and run the R code.

First, load the historical data set in R.

x <- c(350, 320, 410, 370, 290,
330, 340, 350, 340, 370)
y <- c(75, 68, 82, 76, 47, 70,
76, 72, 69, 74)


Now apply the linear model function lm() to this data to find the relationship between the two variables.

relationship <- lm(y~x)

Print the result:

print(relationship)

Coefficients:
(Intercept)            x  
   -15.2784       0.2484  


This prints the coefficients: the intercept b (-15.2784) and the slope m (0.2484), the coefficient of x.
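As a hand check on these coefficients, assuming lm() fits the line by ordinary least squares (its default behaviour), the slope and intercept can be recomputed from the ten data points. The means and sums below were worked out from the x and y vectors above; the bar notation for the means is introduced here for illustration.

```latex
\[
\bar{x} = \frac{3470}{10} = 347, \qquad \bar{y} = \frac{709}{10} = 70.9
\]
\[
m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
  = \frac{2337}{9410} \approx 0.2484
\]
\[
b = \bar{y} - m\,\bar{x} = 70.9 - \frac{2337}{9410} \times 347 \approx -15.2784
\]
```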

Now we can use the predict() function, which applies the fitted straight-line equation, to predict the stock price for a new market value.

inputX <- data.frame(x = 450)
resultY <-  predict(relationship, inputX)
print(resultY)

       1 
96.48034 
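predict() here simply evaluates the fitted line at x = 450. A quick check with the rounded coefficients from the printed output gives almost the same number. The small difference from 96.48034 is a rounding effect, since R keeps the unrounded coefficients internally.

```latex
\[
y = mx + b \approx 0.2484 \times 450 - 15.2784 = 96.5016
\]
```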

The data points and the fitted line can be plotted for a better understanding.

plot(y~x,col="red")

abline(relationship)


Quickstart to Machine Learning with Simple Linear Regression

Machine learning is a process that learns from given data and produces a model from which insights can be generated. It can be used for various types of data analytics, such as predictive analytics.

Many languages and tools are available for statistics and machine learning. Some of them are listed below.
Languages
R, Python, Java, Scala, etc.
Frameworks
Spark MLlib, TensorFlow, Theano, Caffe, etc.

R is the easiest and simplest of these to start learning with. We can use http://www.r-fiddle.org/, an online R console.

If you are a beginner in machine learning, understanding linear regression is the best way to start your journey.

Predicting the future by analysing historical data is the ultimate aim of predictive analytics. Linear regression is one of the statistical methods for achieving it. Even though the model is quite simple, it is widely used in various industries for sales and market model prediction, fitness analytics, real estate price prediction, etc.

Linear regression can be seen as a relationship between two kinds of variables, dependent and independent. There can be either a single independent variable (simple linear regression) or multiple independent variables (multiple linear regression).

Here we can see how simple linear regression works.
If the value of one variable changes linearly with the value of another, the relationship between the two variables is called a linear relationship.
Let X (independent) and Y (dependent) be the variables.
If the relationship between these two variables can be identified, we can predict the value of Y from a known X.

The equation of a straight line describes this linear relationship.

Equation: y = mx + b

m (the slope) and b (the intercept) are the parameters of the relationship.

The lm() function in R finds this relationship, i.e. the values of m and b.
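As a rough sketch of what lm() works out for a single predictor: ordinary least squares picks m and b so that the sum of squared differences between the observed and the predicted y values is as small as possible. The symbols n, x_i, y_i and the barred means below are notation introduced here for illustration.

```latex
\[
\min_{m,\,b} \; \sum_{i=1}^{n} \bigl( y_i - (m x_i + b) \bigr)^2
\]
\[
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b = \bar{y} - m\,\bar{x}
\]
```

Here the barred symbols are the means of the x and y values, so the fitted line always passes through the point of means.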


That is enough theory. Now we can move on to a practical use case, the “Stock Market Model”, with R.

Friday, July 8, 2016

Spark Hello World with Java 8 and Maven





Java 8 is a major recent release of Java. Its support for functional programming, through additions such as lambda expressions and the new Stream API, makes Java a more feature-rich programming language. Here we can see how these new features can be leveraged to write big data applications using the popular Spark framework.
Spark is written in Scala, which naturally supports functional programming. Even though most of the Spark libraries can be accessed through the Java API, writing Spark programs in Java 7 was not straightforward because Java 7 lacks functional features.
Java 8 makes it easier to write Spark programs with its functional features.
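To make the difference concrete, here is a minimal sketch in plain Java with no Spark dependency. The class LambdaDemo, the interface Action and the helper forEach are hypothetical names made up for this illustration. Action only stands in for the kind of single-method callback interface that a Java API accepts, and it is not part of Spark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; all names here are illustrative assumptions.
class LambdaDemo {

    // Stand-in for a single-method callback interface.
    interface Action<T> {
        void call(T value);
    }

    // Applies the given action to every element of the list.
    static <T> void forEach(List<T> items, Action<T> action) {
        for (T item : items) {
            action.call(item);
        }
    }

    // Java 7 style: the behaviour is wrapped in an anonymous inner class.
    static List<String> greetJava7(List<String> names) {
        final List<String> out = new ArrayList<>();
        forEach(names, new Action<String>() {
            @Override
            public void call(String name) {
                out.add("Hello " + name);
            }
        });
        return out;
    }

    // Java 8 style: the same behaviour as a one-line lambda expression.
    static List<String> greetJava8(List<String> names) {
        List<String> out = new ArrayList<>();
        forEach(names, name -> out.add("Hello " + name));
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("John", "Paul");
        System.out.println(greetJava7(names));
        System.out.println(greetJava8(names));
    }
}
```

Both methods build the same list. The lambda version just drops the boilerplate of the anonymous class, which is the main convenience Java 8 brings to callback-heavy APIs.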

The program below is written in Java 8 on Apache Spark. Those who want to see the same program in Java 7 can refer to my previous blog post.

The Eclipse IDE is used to create and run this program.

Follow the steps below to quickly set up a sample project.

  • Create a simple Maven project and update pom.xml with the configuration below


<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.semanticeyes.sparkhelloworld</groupId>
    <artifactId>spark-helloworld</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>Spark Helloworld</name>
    <packaging>jar</packaging>
    <repositories>
        <repository>
            <id>apache</id>
            <url>https://repository.apache.org/content/repositories/releases</url>
        </repository>
    </repositories>
    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.2.0</version>
        </dependency>
        <dependency>
            <groupId>commons-io</groupId>
            <artifactId>commons-io</artifactId>
            <version>2.4</version>
        </dependency>
    </dependencies>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <build>
        <pluginManagement>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>3.1</version>
                    <configuration>
                        <source>${java.version}</source>
                        <target>${java.version}</target>
                    </configuration>
                </plugin>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-assembly-plugin</artifactId>
                    <version>2.2.2</version>
                    <configuration>
                        <descriptors>
                            <descriptor>src/main/assembly/assembly.xml</descriptor>
                        </descriptors>
                    </configuration>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>
</project>


  • Create a hello world Java class named HelloSpark

package com.semanticeyes.sparkhelloworld;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class HelloSpark {
    public static void main(String[] args) {

        // Run Spark in local mode with the application name "HelloSpark"
        JavaSparkContext sc = new JavaSparkContext("local", "HelloSpark");

        // Turn a plain Java list into an RDD
        List<String> inputList = Arrays.asList("John", "Paul", "Gavin", "Rahul", "Angel");
        JavaRDD<String> inputRDD = sc.parallelize(inputList);

        // Java 8 lambda: print each element of the RDD
        inputRDD.foreach(x -> System.out.println(x));

        // Shut down the Spark context once the job is done
        sc.stop();
    }
}


That's it. Go ahead and run it.