As for why the second function is so much faster: it uses a technique called dynamic programming, here in the form of memoization. The idea is to store the function's result for every input it is computed on, so the algorithm never needs to recompute a result that has already been calculated.
In the case of the Ackermann function this means you only need O(m * n) operations to calculate ackermann(m, n), which is much faster than the first approach.
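For reference, the first approach is presumably just the standard recursive definition, which shares nothing between calls:

    -- Plain recursion: the subcall ack m (n - 1) is recomputed from
    -- scratch every time it is needed, so the call tree explodes.
    ack :: Int -> Int -> Int
    ack 0 n = n + 1
    ack m 0 = ack (m - 1) 1
    ack m n = ack (m - 1) (ack m (n - 1))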
The way dynamic programming is implemented in that example code is with lazy lists. In Haskell every value is immutable, so once, for example, acks !! (m - 1) !! 1 has finished computing, its result value remains in the list. When that same location is queried again, the value does not have to be recomputed.
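A minimal sketch of that technique (reusing the acks name from the question; the exact code there may differ slightly):

    -- acks !! m !! n holds ackermann(m, n). Each list element starts
    -- out as a lazy thunk; indexing it forces the computation once,
    -- and afterwards the evaluated value is shared by every lookup.
    acks :: [[Int]]
    acks = [ [ ack m n | n <- [0..] ] | m <- [0..] ]
      where
        ack 0 n = n + 1
        ack m 0 = acks !! (m - 1) !! 1
        ack m n = acks !! (m - 1) !! (acks !! m !! (n - 1))

For example, acks !! 3 !! 4 evaluates to 125, and a second lookup of the same position is essentially free.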
This way of implementing dynamic programming probably takes somewhat more than O(m * n) time, because indexing a list with !! is linear in the index. You could store the values in an m by n array, which has constant-time indexing, to make it even faster.
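A sketch of that array variant, assuming we pick fixed table bounds mMax and nMax (both names are made up here) and fall back to plain recursion for arguments outside the table:

    import Data.Array

    -- Hypothetical array-backed memoization: the table entries are
    -- lazy thunks, exactly as in the list version, but lookup with
    -- (!) is O(1) instead of O(index).
    ackMemo :: Int -> Int -> Int -> Int -> Int
    ackMemo mMax nMax = go
      where
        table = array ((0, 0), (mMax, nMax))
                      [ ((m, n), compute m n)
                      | m <- [0 .. mMax], n <- [0 .. nMax] ]
        go m n
          | m <= mMax && n <= nMax = table ! (m, n)  -- memoized lookup
          | otherwise              = compute m n     -- outside the table
        compute 0 n = n + 1
        compute m 0 = go (m - 1) 1
        compute m n = go (m - 1) (go m (n - 1))

For example, ackMemo 3 1000 3 4 also evaluates to 125; all the intermediate second arguments stay well under 1000, so every subresult is served from the array.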
But the Ackermann function was invented to demonstrate the existence of extremely slow-to-compute functions (it grows faster than any primitive recursive function), so implementing it with a dynamic programming algorithm kind of defeats its purpose.