PixBlog 6 - The land of 14ers [Colorado]

  In the midst of the DeCaLiBron - Mt. Democrat to the right, Mt. Bross to the far left [Alma, Colorado]

  Moment of contemplation - left (Mt. Democrat) or right (Mt. Cameron) [Cameron-Democrat saddle]

Heavy legs and tired lungs - Summit of Mt. Democrat (14,154 feet !!)

Football field at 14k feet !! - Summit of Mt. Cameron

A water-filled kite on the ground - Looking down at Kite Lake Trailhead [Alma, Colorado]

Summit Lake below Mt. Evans

Looking down Mt. Evans Road and Scenic Byway [Summit of Mt. Evans]

The scramble awaits - Looking up Mt. Bierstadt summit [Clear Creek, Colorado]

Sun-lit sawtooth ridge - Looking down from summit of Mt. Bierstadt

Of fleeting lakes and perennial ridges - Summit of Mt. Bierstadt

Scary Sawtooth ridge from the side 

For some lines are drawn pretty fine - The Continental Divide

Road to Aspen ain't that way !! [Independence Pass, Colorado]

El famous Maroon Bells - [Maroon Bells, Colorado]

Beautiful Breck - [Main St, Breckenridge, Colorado]


Spark GraphX Example with RDF data - Modelling IMDb data as a graph

I had some time to take a deep dive into GraphX, and since it is only supported in Scala, it turned out to be a good learning experience. Coming from experimenting with graph databases, GraphX felt like an easy platform that spares you from worrying about sharding/partitioning your data. The GraphX paper and Bob DuCharme's blog are good starting points for playing with GraphX.
I managed to get Indian movie data for 10 different languages (Hindi, Marathi, Punjabi, Telugu, Tamil, Bhojpuri, Kannada, Gujarati, Bengali, Malayalam), along with the actors and directors associated with the films.
Next came the part of building relationships between these entities. GraphX associates a VertexId, a 64-bit Long, with each vertex in the graph, so I generated one for every entity and substituted them into the movie file for easier parsing later when creating the edges (relationships). The part that makes GraphX super easy and super fun is that partitioning happens along the edges: a vertex is shipped to whichever nodes hold edges that reference it, but its VertexId stays the same everywhere. The essential syntax to consider is -

For a vertex -> RDD[(VertexId, VertexObject)]
In my case, the vertex RDD is -> RDD[(VertexId, (String, String))]

For an edge -> Edge(srcVertexId, dstVertexId, "<edgeValueString>")
You can also use a URI or other data types as your edge value (for example, a weight on the edge, which is handy when you need to pick a destination based on the edge value).

These two, combined in a single line, create the graph object for us -
val graph = Graph(entities,relationships)
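
To make those shapes concrete, here is a tiny illustrative sketch (the vertex IDs are made up, the two names are taken from the example data later in this post, and it assumes an existing SparkContext called sc like the one the main program below creates):

import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Vertex RDD: (VertexId, (name, entityType)) -- the same shape as my entities RDD
val tinyEntities: RDD[(VertexId, (String, String))] = sc.parallelize(Seq(
  (1L, ("dilwale dulhania le jayenge", "movie")),
  (2L, ("shah rukh khan", "actor"))
))

// Edge RDD: Edge(srcVertexId, dstVertexId, edgeValue)
val tinyRelationships: RDD[Edge[String]] = sc.parallelize(Seq(
  Edge(1L, 2L, "actor"),    // movie -> actor
  Edge(2L, 1L, "inFilm")    // actor -> movie ("and back")
))

val tinyGraph = Graph(tinyEntities, tinyRelationships)
tinyGraph.triplets.collect().foreach(t => println(s"${t.srcAttr._1} -${t.attr}-> ${t.dstAttr._1}"))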

fig - Graph depiction of data

So I ended up connecting each movie to all its actors and back, to its director and back, and to the language the movie was in and back. This led to some half a million edges in my graph (in essence half as many relationships, since I created two edges, one in each direction, between every entity and its movie). I also added literal properties to the movies (which are not nodes), such as imdbRating and potentialAction (the IMDb web URL for that movie). Literal properties could instead be added to the vertex object directly if that object were a class with the properties as separate fields. The code to construct and query the graph is below. The main program could be replaced by a Scalatra web server and used for the same purpose, but I was more focused on the graph functionality.
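Before the full listing, a quick sketch of that alternative (a hypothetical MovieVertex class of my own, not something the code below actually uses):

// Hypothetical vertex payload carrying the literal properties as plain fields
case class MovieVertex(name: String, entityType: String,
                       imdbRating: Option[Double], potentialAction: Option[String])

// The vertex RDD would then be RDD[(VertexId, MovieVertex)], e.g.
// (112870L, MovieVertex("dilwale dulhania le jayenge", "movie",
//                       Some(8.3), Some("http://www.imdb.com/title/tt0112870/")))
// Non-movie vertices (actors, directors, languages) would simply leave the two options empty.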

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
import scala.io.Source
import org.json4s._
import org.json4s.jackson.JsonMethods._
import scala.collection.mutable.ListBuffer

object MoviePropertyGraph {
  val baseURI = "http://kb.rootchutney.com/propertygraph#"

  class InitGraph(val entities:RDD[(VertexId, (String, String))], val relationships:RDD[Edge[String]],
                  val literalAttributes:RDD[(Long,(String,Any))]){
    val graph = Graph(entities,relationships)

    def getPropertyTriangles(): Unit ={
          // Output object property triples
       graph.triplets.foreach( t => println(
         s"<$baseURI${t.srcAttr._1}> <$baseURI${t.attr}> <$baseURI${t.dstAttr._1}> ."
       ))
    }

    def getLiteralPropertyTriples(): Unit = {
      entities.foreach(t => println(
        s"""<$baseURI${t._2._1}> <${baseURI}role> \"${t._2._2}\" ."""
      ))
    }

    def getConnectedComponents(): Unit = {
      val connectedComponents = graph.connectedComponents()
      graph.vertices.leftJoin(connectedComponents.vertices) {
        case(id, entity, component) => s"${entity._1} is in component ${component.get}"
      }.foreach{ case(id, str) => println(str) }
    }

    def getMoviesOfActor(): Unit = {
      println("Enter name of actor:")
      val actor = Console.readLine()
      graph.triplets.filter(triplet => (triplet.srcAttr._1.equals(actor)) && triplet.attr.equals("inFilm"))
        .foreach(t => println(t.dstAttr._1))
    }

    def getMoviesOfDirector(): Unit = {
      println("Enter name of director:")
      val director = Console.readLine()
      graph.triplets.filter(triplet => (triplet.srcAttr._1.equals(director)) && triplet.attr.equals("movie"))
        .foreach(t => println(t.dstAttr._1))
    }

    def getMoviesInLanguage(): Unit = {
      println("Enter language:")
      val lang = Console.readLine()
      graph.triplets.filter(triplet => (triplet.dstAttr._1.equals(lang)) && triplet.attr.equals("inLanguage"))
        .foreach(t => println(t.srcAttr._1))
    }

    def getTopMoviesBeginningWith(): Unit = {
      println("Enter prefix/starting tokens:")
      val prefix = Console.readLine()
      graph.vertices.filter(vertex => vertex._2._1.startsWith(prefix) && vertex._2._2.equals("movie"))
        .distinct()
        .leftOuterJoin(literalAttributes.filter(literalTriple => (literalTriple._2._1.equals("imdbRating"))))
        .sortBy(element => element._2._2.get._2.asInstanceOf[Double], false)
        .foreach(t => println(t._2._1 +" " + t._2._2.get._2.asInstanceOf[Double]))
    }

    def getActorsInMovie(): Unit = {
      println("Enter movie:")
      val movie = Console.readLine()
      graph.triplets.filter(triplet => (triplet.srcAttr._1.equals(movie)) && triplet.attr.equals("actor"))
        .foreach(t => println(t.dstAttr._1))
    }
  }

  def main(args: Array[String]) {
    val sc = new SparkContext("local", "MoviePropertyGraph", "127.0.0.1")
    val movieFile = ""
    val actorFile = ""
    val directorFile = ""
    val languageFile = ""

    val movies = readJsonsFile(movieFile, "movie", sc)
    val actors = readCsvFile(actorFile, "actor", sc)
    val directors = readCsvFile(directorFile, "director", sc)
    val languages = readCsvFile(languageFile, "language", sc)

    val entities = movies.union(actors).union(directors).union(languages)
    val relationships: RDD[Edge[String]] = readJsonsFileForEdges(movieFile, sc)
    val literalAttributes = readJsonsFileForLiteralProperties(movieFile, sc)
    val graph = new InitGraph(entities, relationships, literalAttributes)
// Some supported functionality in graphx
//    graph.getPropertyTriangles()
//    graph.getLiteralPropertyTriples()
//    graph.getConnectedComponents()

    var input = 0
    do {
      println("1. Movies of actor \n2. Movies of director \n3. Movies in language \n4. Top Movies beginning with " +
        "\n5. Actors in movie \n6. Exit\n")
      input = Console.readInt()
      input match {
        case 1 => graph.getMoviesOfActor()
        case 2 => graph.getMoviesOfDirector()
        case 3 => graph.getMoviesInLanguage()
        case 4 => graph.getTopMoviesBeginningWith()
        case 5 => graph.getActorsInMovie()
        case 6 => "bye !!!"
        case something => println("Unexpected case: " + something.toString)
      }
    } while (input != 6);
    sc.stop
  }

  def readJsonsFile(filename: String, entityType: String, sparkContext:SparkContext):  RDD[(VertexId, (String, String))] = {
    try {
     val entityBuffer = new ListBuffer[(Long,(String,String))]
      for (line <- Source.fromFile(filename).getLines()) {
        val jsonDocument = parse(line)
        val movieId = (jsonDocument \ "movieId" \\ classOf[JString]).head.toLong
        val name = (jsonDocument \ "name" \\ classOf[JString]).head
        entityBuffer.append((movieId, (name, entityType)))
      }
      return sparkContext.parallelize(entityBuffer)
    } catch {
      case ex: Exception => println("Exception in reading file:" + filename + ex.getLocalizedMessage)
        return sparkContext.emptyRDD
    }
  }

  def readCsvFile(filename: String, entityType: String, sparkContext:SparkContext): RDD[(VertexId, (String, String))] = {
    try {
      val entityBuffer = new ListBuffer[(Long,(String,String))]
      for (line <- Source.fromFile(filename).getLines()) {
        // Each line holds one or more 'name': vertexId pairs
        entityBuffer.appendAll(line.split(",")
          .map(_.split(":"))
          .map { case Array(k, v) => (k.trim.replaceAll("'", ""), v.trim.toLong) }
          .map { case (k, v) => (v, (k, entityType)) })
      }
      return sparkContext.parallelize(entityBuffer)
    } catch {
      case ex: Exception => println("Exception in reading file:" + filename + ex.getLocalizedMessage)
        return sparkContext.emptyRDD
    }
  }

  def readJsonsFileForEdges(filename: String, sparkContext:SparkContext): RDD[Edge[String]] = {
    try {
      val relationshipBuffer = new ListBuffer[Edge[String]]
      for (line <- Source.fromFile(filename).getLines()) {
        val jsonDocument = parse(line)
        val movieId = (jsonDocument \ "movieId" \\ classOf[JString]).head.toLong
        val director = (jsonDocument \ "director" \\ classOf[JInt]).head.toLong
        val language = (jsonDocument \ "inLanguage" \\ classOf[JInt]).head.toLong
        // Two edges (one in each direction) between the movie and every connected entity
        for (actor <- (jsonDocument \ "actors" \\ classOf[JInt]).map(_.toLong)) {
          relationshipBuffer.append(Edge(movieId, actor, "actor"), Edge(actor, movieId, "inFilm"))
        }
        relationshipBuffer.append(Edge(director, movieId, "movie"), Edge(movieId, director, "director"),
          Edge(movieId, language, "inLanguage"), Edge(language, movieId, "language"))
      }
      val relationshipRDD = sparkContext.parallelize(relationshipBuffer)
      println("Number of edges: " + relationshipRDD.count())
      return relationshipRDD
    } catch {
      case ex: Exception => println("Exception in reading file:" + filename + ex.getLocalizedMessage)
        return sparkContext.emptyRDD
    }
  }

  def readJsonsFileForLiteralProperties(filename: String, sparkContext:SparkContext): RDD[(Long,(String,Any))] = {
    try {
      val literalsBuffer = new ListBuffer[(Long,(String,Any))]
      for (line <- Source.fromFile(filename).getLines()) {
        val jsonDocument = parse(line)
        val movieId = (jsonDocument \ "movieId" \\ classOf[JString]).head.toLong
        val imdbRating = (jsonDocument \ "imdbRating" \\ classOf[JDouble]).head
        val potentialAction = (jsonDocument \ "potentialAction" \\ classOf[JString]).head
        literalsBuffer.append((movieId, ("imdbRating", imdbRating)), (movieId, ("potentialAction", potentialAction)))
      }
      return sparkContext.parallelize(literalsBuffer)
    } catch {
      case ex: Exception => println("Exception in reading file:" + filename + ex.getLocalizedMessage)
        return sparkContext.emptyRDD
    }
  }
}


The function "getTopMoviesBeginningWith" is the coolest of all functions as it sorts the entities based on literal properties(imdb) joined into the same RDD and provides a lame autosuggest for movies starting with a prefix.
My actors, director and languages files were in the form of csv containing name and vertexID values that I had pre-generated. For example,
'shah rukh khan': 75571848106
'subhash ghai': 69540999294
'hindi': 10036785003, 'telugu': 61331709614
My movie file was a JSON file containing all of this connected data. For example -
{"inLanguage": 10036785003,
  "director": 52308324411,
  "movieId": "0112870",
  "potentialAction": "http://www.imdb.com/title/tt0112870/",
  "name": "dilwale dulhania le jayenge",
  "imdbRating": 8.3,
 "actors": [75571848106, 19194912859, 79050811553, 37409465222, 44878963029, 10439885041, 85959578654, 31779460606, 39613049947, 29336046116, 64948799477, 71910110134, 1911950262, 19092081589, 77693233734]
}
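
For what it's worth, here is a small illustrative sketch of pulling such a line apart with json4s (the same library the reader functions above use); the values are copied from the example, and the snippet is not part of the main program:

import org.json4s._
import org.json4s.jackson.JsonMethods._

val line = """{"inLanguage": 10036785003, "movieId": "0112870", "imdbRating": 8.3, "name": "dilwale dulhania le jayenge"}"""
val json = parse(line)

// Pattern-match the fields of interest straight out of the parsed AST
val JString(movieId) = json \ "movieId"
val JString(name)    = json \ "name"
val JDouble(rating)  = json \ "imdbRating"
val JInt(language)   = json \ "inLanguage"
println(s"${movieId.toLong} | $name | $rating | ${language.toLong}")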

If you happen to need these files, do let me know and I can email them. Last but not least, I found building my project with SBT much easier than the non-SBT route, as it gives a Gradle-like way to build projects.
Here is the sample sbt file that I used to build the project -

name := "PropertyGraph"

version := "1.0"

scalaVersion := "2.10.4"

connectInput in run := true

// additional libraries
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.5.0" % "provided",
  "org.apache.spark" %% "spark-sql" % "1.5.0",
  "org.apache.spark" %% "spark-hive" % "1.5.0",
  "org.apache.spark" %% "spark-streaming" % "1.5.0",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.5.0",
  "org.apache.spark" %% "spark-streaming-flume" % "1.5.0",
  "org.apache.spark" %% "spark-mllib" % "1.5.0",
  "org.apache.spark" %% "spark-graphx" % "1.5.0",
  "org.apache.commons" %% "commons-lang3" % "3.0",
  "org.eclipse.jetty"  %% "jetty-client" % "8.1.14.v20131031",
  "org.json4s" %% "json4s-native" % "{latestVersion}",
  "org.json4s" %% "json4s-jackson" % "{latestVersion}"
)

The experiment did not give me real-time computation over the graph (the way Neo4j or Virtuoso graph databases do), since Spark processing takes time even on a single node, and I would assume that increasing the number of nodes adds further overhead for distributing the partitions and fetching results. I hope this keeps getting better with future versions of Spark (1.6 is soon to be out !!)

PixBlog 5 - The Pacific Northwest [Northern Oregon & Olympic National Park]

            Cannon Beach [Haystack Rock]

 Canada across the strait !! [Port Angeles]

                                   Another one of those roads to perdition [Olympic National Park]

Turn up that heat [Heart O' the Hills road, Olympic National Park] 

Tunnel in a tunnel [Heart O' the Hills road, Olympic National Park]  

  Ridgy-Ridge [Hurricane ridge, Olympic National Park]

[Heart O' the Hills road, Olympic National Park]   

 [Elwha Rainforest, Olympic National Park]  

 [Lake Crescent, Olympic National Park]

Lost trees up-top [Rialto beach]

All possible permutations of tokens in a line, given each token has two (n) possible values

So I came across an interesting problem that was essentially a variation of permutation problems I had seen before. I ended up writing the brute-force solution first, which was not good enough, and had to revisit the problem to optimize it.
Here is my best attempt at describing the problem -
Each token in a line is sent through an encryption module, which gives out a transformed version of the token. The encryption module has some bugs in it, so not every word always gets encrypted/transformed. Find all possible representations of the line, covering every combination of each token having been successfully transformed or not.

I hope I did not make the problem statement too confusing, but to make things a bit clearer, here is an example :-
Line -> “W1 W2"
“W1" transforms to a transformed word called “TW1”
“W2” transforms to a transformed word called “TW2”

Hence, considering that both words may not always get transformed, here are all the possible permutations of the above line.
Permutation1 -> “W1 W2”
Permutation2 -> “TW1 W2”
Permutation3 -> "W1 TW2”
Permutation4 -> “TW1 TW2”

Similarly, if the line was “W1 W2 W3” with the respective transforms being “TW1”, “TW2” and “TW3” then the possible permutations are :-
“W1 W2 W3”
“TW1 W2 W3”
“TW1 W2 TW3”
“TW1 TW2 W3"
“W1 W2 TW3"
“W1 TW2 W3"
“W1 TW2 TW3"
“TW1 TW2 TW3”

Solution 1(Brute force) :-
This was the first approach I took to get things done.
Create a hashset or a treeset of results and go over each possible combination with the given token at hand. To top it off, I made the solution worse by making it recursive. So, for example, if you take the first token in a three-token line, you get all the possible combinations of the remaining two tokens and append them to W1 and TW1. Then you move on to W2, get all possible permutations of (W1, TW1) and (W3, TW3), and attach them around W2 and TW2.
As can be seen, this solution is tedious and wasteful. We end up recalculating many of the already finished permutations, and there needs to be an easy way to avoid that.
The worst case is when you have many tokens around the token at hand and have to recursively calculate the combinations on both sides of the token and then permute them again with the given token. So, one creates three subproblems :-
i) Calculate permutation before the token(P1)
ii) Calculate permutation after the token(P2)
iii) Add in Permutations of P1,token,P2

I do not wish to rewrite code for such a lame approach, though the time complexity would be O(N x maximum number of permutations before a token x maximum number of permutations after a token), where N is the number of tokens,
which basically amounts to O(N!(N-1)!….1!).
Enough said about how bad it was.
Unfortunately, this method only worked up to about 10 or 11 tokens on my laptop, and my laptop gave up after that.

Solution 2(Use a tree/graph approach):-
This solution was thought up to avoid recalculating a path already taken. Once all possible paths of the tree have been taken, one can be assured that all possible permutations have been calculated. The way to think about this solution is to build a directed graph with two start points, W1 and TW1, and then do a depth-first search down to the last token and its transform (in this case W3 and TW3). I think of this as a problem having two sources (W1 and TW1) and two sinks (W3 and TW3).
Here is a diagram to explain the same.
fig - Graph Approach

The interesting part is that with this approach I ended up using a little more space than I normally would, to keep tabs on the nodes I had already visited.
This approach has O(V * E) time complexity, which is O(N x 2N), essentially O(N^2), with O(2N) space complexity to maintain the visited nodes.
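
For reference, here is a compact sketch of that idea in Scala (my own sketch, not the code I originally wrote): the suffixes of the line are built once, right to left, and every already-computed suffix is reused instead of being recalculated. It assumes every token has an entry in the transform map.

// suffixes for tokens(i..end) are computed once and shared by both choices at position i
def allRenderings(tokens: List[String], transform: Map[String, String]): List[String] =
  tokens.foldRight(List("")) { (token, suffixes) =>
    suffixes.flatMap(suffix => List(token + " " + suffix, transform(token) + " " + suffix))
  }.map(_.trim)

// allRenderings(List("W1", "W2"), Map("W1" -> "TW1", "W2" -> "TW2"))
// -> List("W1 W2", "TW1 W2", "W1 TW2", "TW1 TW2")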

Solution 3(Using a binary bit based approach) :-
Thanks to Randy Gobbel for spotting a pattern in the above graph and suggesting a uniquely simple approach that needs no extra space. Randy noted that each place in the resulting permutation has two possible values (W or TW). This is more or less like a 0 or a 1. Hence, the permutations for a three-token line amount to -
000, 001, 010, 011, 100, 101, 110, 111
where 0 -> the token itself
          1 -> the token’s transform
As a result, this solution boils down to simply iterating over all values from 0 up to (but not including) 2 raised to the power of the number of tokens. Below is the code for the approach. The method signature takes a list of tokens and a Map/Dict containing the transform for each token.

public List<String> getPossiblePermutations3(List<String> words, Map<String, String> transformMap) {
    if (words == null || words.isEmpty())
        return null;
    List<String> permutationList = new ArrayList<>();
    for (int i = 0; i < (int) Math.pow(2, words.size()); ++i) {
        StringBuilder permutation = new StringBuilder();
        for (int j = 0; j < words.size(); ++j) {
            // If bit j of i is clear, keep the original token; otherwise use its transform
            if ((i & (1 << j)) == 0)
                permutation.append(words.get(j)).append(" ");
            else
                permutation.append(transformMap.get(words.get(j))).append(" ");
        }
        // Trim the trailing space
        permutationList.add(permutation.toString().trim());
    }
    return permutationList;
}
The time complexity of this approach is O(N x 2^N), with no extra space needed.

Thanks for reading…

Elasticsearch Delete Issues

Problem :-
We have some types of documents that have to be deleted daily, based on a date field in them, once that date has expired. An alert from another tool fires if such an expired document appears in the search results (the end service filters them out).
For the past month, we saw the alert getting triggered every night between 12am and 7:15am, after which it would disappear (no more data from the previous day).

Remember that these documents are deleted daily using an XDELETE request to the Elasticsearch cluster.

Experiments :-
We run an hourly job to delete old data, and logically one expects the old data to disappear as soon as the first delete job runs after 12am, but we kept seeing the alert, with the log showing that a result from the previous day had triggered it.

I increased the job frequency to every 15 minutes, with a "refresh" on the index after each job, and yet the Splunk alert would show up until 7:15am every day. The pattern we saw was that between 6am and 7am the number of old documents begins to decrease, and finally there are none after 7:15am. Even an optimize with "only_expunge_deletes" does not shorten the time these documents keep showing up in the Elasticsearch results.

Possible causes :-
• The second-to-last comment on this forum reveals that Elasticsearch only marks documents as deleted, and by default merges only happen on an index when there are 10% or more documents to be added or deleted. As we index daily, 7:15am is probably when a merge happens and the documents actually get removed.

• Though we see the docID in the logs, searching the Elasticsearch cluster for that ID returns no results, which points to the head plugin showing different results than the client API.

Possible solution :-
Note that the index does not use a TTL (time-to-live) value on the documents, but as per the Elasticsearch forums, people still see this issue even with TTL.
So, supposedly, this is how Lucene and Elasticsearch try to be efficient by limiting merges, since a merge is an expensive operation from a memory perspective.
The likely solution given by the Elasticsearch forum experts is to create a new index and swap indices whenever possible (for us this would mean swapping indices every day), since a fresh index gives faster searches by avoiding deleted documents in the lookup (overkill !!).

Considering that most Elasticsearch users talk about document counts in the millions, it is not always feasible to reindex into a brand new index daily. So an alternate, lighter approach can be to update each document with a deleted field set to true and filter such documents out at query time.
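
A minimal sketch of that filter (assuming the same Java transport client API used elsewhere in these notes; the index and field names here are placeholders, not our real ones):

import org.elasticsearch.client.Client
import org.elasticsearch.index.query.{FilterBuilders, QueryBuilders}

// Run the usual query but hide anything that has been soft-deleted
def searchWithoutSoftDeleted(client: Client, text: String) =
  client.prepareSearch("my_index")
    .setQuery(QueryBuilders.matchQuery("name", text))
    .setPostFilter(FilterBuilders.notFilter(FilterBuilders.termFilter("deleted", true)))
    .execute()
    .actionGet()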

Elasticsearch document inconsistency

Problem :-
A particular search query would sometimes return results and sometimes would not.


Root cause analysis :-
A pattern observed with the hits and misses was 011100, occurring repeatedly.
Elasticsearch by default spreads queries across the copies of each shard on different nodes, so if you want to direct your query to a particular node, the "preference" parameter can be set to a nodeID.

The way to get a nodeID (the one that I found) is -
curl -XGET 'localhost:9200/_nodes'

wherein every node has an ID associated with it in the JSON response.
The nodeID can be used as -
searchRequestBuilder.setPreference("_only_node:xyz");

Using this preference, one can narrow down which shard was showing the inconsistent behaviour for the given document. The shard stats showed the same "numDocs" as the other copies, so it seemed that, for some reason, the document was simply not searchable on that shard.

Found a similar issue in the Elasticsearch groups -

Resolution :-
The Elasticsearch logs did not show any issue on the nodes, and a "refresh" and "flush" on the index produced no exception/error in the logs, nor did they resolve the issue.
As per the issue thread, people had either recreated their indices or stuck to using only the primary shard (neither was an option for us, as recreating the index would take a lot of time, and hitting only the primary shard always gave us no hits).
The no-downtime way was to bring the replica shards down to 0 and create them again (which luckily fixed the issue for us).

curl -XPUT 'localhost:9200/my_index/_settings' -d '{
  "index" : {
    "number_of_replicas" : 0
  }
}'

Run an indexing job to fill back in the documents lost in the process.
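
For reference, a sketch of doing that replica flip through the same Java client API (the index name and the restored replica count of 1 are placeholders; the curl above does the same thing):

import org.elasticsearch.client.Client
import org.elasticsearch.common.settings.ImmutableSettings

// Hypothetical helper: update number_of_replicas on a live index
def setReplicas(client: Client, index: String, replicas: Int): Unit =
  client.admin().indices().prepareUpdateSettings(index)
    .setSettings(ImmutableSettings.settingsBuilder().put("index.number_of_replicas", replicas).build())
    .execute().actionGet()

// setReplicas(client, "my_index", 0)   // drop the replicas
// setReplicas(client, "my_index", 1)   // recreate them from the primaries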

Possible Causes :-
We had some downtime in the cluster due to high CPU usage, and maybe such outages caused an inconsistency in the shards. That said, we still kept getting consistent results for most of our queries.
So the guess is that any such issue can cause Elasticsearch shards to become inconsistent, and the resolution varies depending on the number of documents in the index and the downtime that can be afforded.

Sinking the Submarine Puzzle

Every now and then one comes across a puzzle that makes you delve deep into number theory, and though things appear weird at the start, they end up being deterministic... or, more precisely, certainly solvable, even if it takes arbitrarily long !!

Here is the "Sinking the Submarine" problem -
[I got this puzzle from Alex Mikhailov and a little help from Yuval]

An enemy submarine is somewhere on the number line (consider only integers for this problem). It is moving at some rate (again, integral units per minute). You know neither its position nor its velocity. You can launch a torpedo each minute at any integer on the number line. If the submarine is there, you hit it and it sinks. You have all the time and torpedoes you want. You must sink this enemy sub - devise a strategy that is guaranteed to eventually hit it.

Solution :-
Looking at the problem, one feels a simple strategy can get us to the solution. But then one starts asking random questions, like
- Do we know the initial position?
- Should we even care about the initial position?

Remember that the submarine has no acceleration, so at least some of the randomness has already been constrained. So let's go about proving that the submarine can be sunk with certainty.
In this puzzle, the submarine's position on the number line at any time is a simple function of its initial position and its velocity. Remember that the speed (velocity) remains constant, so the submarine travels the same distance between times t0 and t1 as it does between times t1 and t2.
The possible values of velocity are 0,1,2,..... infinity
The possible values of starting displacement are 0,1,2,...infinity.
So we need to traverse the combinations of starting displacement and velocity in such a way that we hit the submarine somewhere before infinity. Using a simple function, such as
currentPosition = startingDisplacement + (time * velocity)
we can brute-force each combination and sink the submarine in finite time.
Here is one of the lame ways to iterate over all possible combinations of starting displacement and velocity


Had some free time, so decided to code it in.

import java.util.Scanner;

public class SinkTheSubmarine {

    public static void printNumShotsRequired(long startPosition, long velocity) {
        if (startPosition == 0 && velocity == 0) {
            System.out.println("Submarine hit on 1 attempt");
            return;
        }
        long subPosition = startPosition;
        long time = 1;
        long startDisp = 0;
        long vel = 0;
        long shotPosition = 0;
        boolean hit = false;
        boolean changeStartDisp = false;
        boolean changeVel = false;

        startDisp++;
        changeStartDisp = true;
        do {
            System.out.println("startDisp:" + startDisp + " vel:" + vel);
            subPosition = startPosition + (time * velocity);
            shotPosition = startDisp + (time * vel);
            if (shotPosition == subPosition) {
                System.out.println("Submarine hit on " + (time + 1) + " attempt");
                hit = true;
            }
            if (changeStartDisp && !changeVel) {
                if (startDisp == 0) {
                    // Move down
                    vel++;
                    changeStartDisp = false;
                    changeVel = true;
                } else {
                    // Move left-bottom
                    vel++;
                    startDisp--;
                }
            } else if (!changeStartDisp && changeVel) {
                if (vel == 0) {
                    // Move right
                    startDisp++;
                    changeStartDisp = true;
                    changeVel = false;
                } else {
                    // Move right-up
                    startDisp++;
                    vel--;
                }
            }
            time++;
        } while (!hit);
        return;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        System.out.println("Enter starting location of submarine:");
        long startDisplacement = scanner.nextLong();
        System.out.println("Enter velocity of submarine:");
        long velocity = scanner.nextLong();
        System.out.println(startDisplacement + " " + velocity);
        printNumShotsRequired(startDisplacement, velocity);
    }
}

Remember that if we had acceleration in the picture, then we would have to change our function to something like the kinematic equation of motion -
s = ut + (at²/2)
and iterate over combinations of starting displacement, velocity and acceleration, and hence, even in that case, sinking the submarine is a certainty.
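
To make that concrete, here is a small sketch (mine, not part of the original puzzle discussion) of enumerating the three-parameter search space the same diagonal way the code above walks the two-parameter one, assuming non-negative integers as before; at each minute you would then fire at whatever position your discrete motion function gives for the current (startDisp, velocity, acceleration) guess.

// Enumerate all (startDisp, velocity, acceleration) triples by increasing sum,
// so any particular triple is reached after finitely many shots
def guesses: Iterator[(Long, Long, Long)] =
  Iterator.from(0).flatMap { s =>
    for {
      d <- (0 to s).iterator
      v <- (0 to (s - d)).iterator
    } yield (d.toLong, v.toLong, (s - d - v).toLong)
  }

// guesses.take(6).toList
// -> List((0,0,0), (0,0,1), (0,1,0), (1,0,0), (0,0,2), (0,1,1))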