Why is topological sort needed for Longest Path in Directed Acyclic Graph?

If we don't sort the vertices topologically, we don't know which adjacent vertex to process first. That can lead to a situation where we use the distance of a vertex v to update the distances of its adjacent vertices adj[v], and only afterwards the distance of v itself gets increased. The vertices in adj[v] could then also get bigger distances, but we won't visit them again.

Example based on the graph you have referenced (http://www.geeksforgeeks.org/wp-content/uploads/LongestPath.png):
Let's say that we are at this step:
[image: Step 1]
Say we start traversing the graph from vertex '0' and choose the vertex with distance 6 (instead of the vertex with distance 2, which we would have chosen had we used topological order). Already processed vertices are green; the vertex currently being processed is red:
[image: Step 2]
We have updated the distance of the last vertex to 7 and we won't increase it again. However, if we had visited the vertex with distance 2 in the previous step, the distance of this vertex would have been 10:
[image: Step 3]
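
The same effect can be reproduced in a few lines. Below is a minimal sketch, not the code from the linked article: the graph, its weights, and the helper names `topological_order` and `relax_once` are made up for illustration. Each vertex is processed exactly once, first in topological order and then in an order that handles `b` before `a`.

```python
from collections import deque

NEG_INF = float("-inf")

def topological_order(graph):
    """Kahn's algorithm; graph maps vertex -> list of (neighbor, weight)."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w, _ in graph[v]:
            indegree[w] += 1
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w, _ in graph[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    return order

def relax_once(graph, source, order):
    """Visit every vertex exactly once, in the given order, relaxing its edges."""
    dist = {v: NEG_INF for v in graph}
    dist[source] = 0
    for v in order:
        if dist[v] == NEG_INF:
            continue  # v has not been reached from the source (yet)
        for w, weight in graph[v]:
            dist[w] = max(dist[w], dist[v] + weight)
    return dist

# Hypothetical DAG, chosen so that the direct edge s->b looks better at first.
graph = {
    "s": [("a", 2), ("b", 6)],
    "a": [("b", 5)],
    "b": [("t", 1)],
    "t": [],
}

good = relax_once(graph, "s", topological_order(graph))
bad = relax_once(graph, "s", ["s", "b", "a", "t"])  # b is processed before a
```

In the second run, `b` is used to update `t` while `b` still has distance 6. Processing `a` later raises `b` to 7, but `t` is never revisited, so it keeps a stale value. In topological order, every predecessor of `b` is finished before `b` is used.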


If we can keep track of visited nodes, it should be possible to use a recursive DFS and some memoization.

Start from the starting node. For each neighbor, calculate (weight of the edge to the neighbor + max distance from the neighbor to the goal). Take the max of those, memoize it as the max distance from this node, and return it.

Basically, if you know the max distance from your neighbors to the goal, you know the max distance from you to the goal. And if you memoize, you won't visit any node more than once.
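
A rough sketch of that idea, assuming the graph is stored as a dict mapping each node to a list of `(neighbor, weight)` pairs. The function name `longest_path_to_goal` and the example graph are hypothetical. A node from which the goal cannot be reached gets negative infinity, so it never wins the max.

```python
NEG_INF = float("-inf")

def longest_path_to_goal(graph, start, goal):
    """Longest weighted path from start to goal in a DAG, via memoized DFS."""
    memo = {}  # node -> max distance from that node to the goal

    def best(v):
        if v == goal:
            return 0
        if v in memo:
            return memo[v]  # already computed, do not explore again
        # Edge weight to each neighbor plus the best distance from that neighbor.
        result = max((weight + best(w) for w, weight in graph[v]), default=NEG_INF)
        memo[v] = result
        return result

    return best(start)

# Hypothetical DAG; "d" is a dead end that cannot reach the goal.
graph = {
    "s": [("a", 2), ("b", 6), ("d", 100)],
    "a": [("b", 5)],
    "b": [("t", 1)],
    "d": [],
    "t": [],
}

result = longest_path_to_goal(graph, "s", "t")
```

The order in which the recursion finishes nodes is a reverse topological order, so this is the same computation seen from the goal's side. On very deep graphs, Python's recursion limit may become a concern, and an explicit stack or the topological-order loop would be safer.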