Scala: Draw table to console

Tokenize it. I'd start with looking at making a few case objects and classes so that you produce a tokenized list which can be operated on for display purposes:

sealed trait TableTokens{
  val width: Int
}
case class Entry(value: String) extends TableTokens{
  val width = value.length
}
case object LineBreak extends TableTokens{
  val width = 0
}
case object Div extends TableTokens{
  val width = 1
}

So then you can form certain constraints with some sort of row object:

case class Row(contents: List[TableTokens]) extends TableTokens{
  val width = contents.foldLeft(0)((x,y) => x = y.width)
}

Then you can check for constraits and things like that in an immutable fashion. Perhaps creating methods for appending tables and alignment...

case class Table(contents: List[TableTokens])

That means you could have several different variants of tables where your style is different from your structure, a la HTML and CSS.


Ton of thanks for the Tabulator code!

There is a modification for Spark dataset tabular printing.

I mean you can print DataFrame content or pulled result set, like

Tabulator(hiveContext.sql("SELECT * FROM stat"))
Tabulator(hiveContext.sql("SELECT * FROM stat").take(20))

The second one will be without header of course, for DF implementation you can set how many rows to pull from Spark data frame for printing and do you need header or not.

 /**
 * Tabular representation of Spark dataset.
 * Usage:
 * 1. Import source to spark-shell:
 *   spark-shell.cmd --master local[2] --packages com.databricks:spark-csv_2.10:1.3.0 -i /path/to/Tabulator.scala
 * 2. Tabulator usage:
 *   import org.apache.spark.sql.hive.HiveContext
 *   val hiveContext = new HiveContext(sc)
 *   val stat = hiveContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").option("delimiter", "\t").load("D:\\data\\stats-belablotski.tsv")
 *   stat.registerTempTable("stat")
 *   Tabulator(hiveContext.sql("SELECT * FROM stat").take(20))
 *   Tabulator(hiveContext.sql("SELECT * FROM stat"))
 */
object Tabulator {

  def format(table: Seq[Seq[Any]], isHeaderNeeded: Boolean) : String = table match {
    case Seq() => ""
    case _ => 
      val sizes = for (row <- table) yield (for (cell <- row) yield if (cell == null) 0 else cell.toString.length)
      val colSizes = for (col <- sizes.transpose) yield col.max
      val rows = for (row <- table) yield formatRow(row, colSizes)
      formatRows(rowSeparator(colSizes), rows, isHeaderNeeded)
  }

  def formatRes(table: Array[org.apache.spark.sql.Row]): String = {
    val res: Seq[Seq[Any]] = (for { r <- table } yield r.toSeq).toSeq
    format(res, false)
  }

  def formatDf(df: org.apache.spark.sql.DataFrame, n: Int = 20, isHeaderNeeded: Boolean = true): String = {
    val res: Seq[Seq[Any]] = (for { r <- df.take(n) } yield r.toSeq).toSeq
    format(List(df.schema.map(_.name).toSeq) ++ res, isHeaderNeeded)
  }

  def apply(table: Array[org.apache.spark.sql.Row]): Unit = 
    println(formatRes(table))

  /**
   * Print DataFrame in a formatted manner.
   * @param df Data frame
   * @param n How many row to take for tabular printing
   */
  def apply(df: org.apache.spark.sql.DataFrame, n: Int = 20, isHeaderNeeded: Boolean = true): Unit =
    println(formatDf(df, n, isHeaderNeeded))

  def formatRows(rowSeparator: String, rows: Seq[String], isHeaderNeeded: Boolean): String = (
    rowSeparator :: 
    (rows.head + { if (isHeaderNeeded) "\n" + rowSeparator else "" }) :: 
    rows.tail.toList ::: 
    rowSeparator :: 
    List()).mkString("\n")

  def formatRow(row: Seq[Any], colSizes: Seq[Int]) = {
    val cells = (for ((item, size) <- row.zip(colSizes)) yield if (size == 0) "" else ("%" + size + "s").format(item))
    cells.mkString("|", "|", "|")
  }

  def rowSeparator(colSizes: Seq[Int]) = colSizes map { "-" * _ } mkString("+", "+", "+")

}

I've pulled the following from my current project:

object Tabulator {
  def format(table: Seq[Seq[Any]]) = table match {
    case Seq() => ""
    case _ => 
      val sizes = for (row <- table) yield (for (cell <- row) yield if (cell == null) 0 else cell.toString.length)
      val colSizes = for (col <- sizes.transpose) yield col.max
      val rows = for (row <- table) yield formatRow(row, colSizes)
      formatRows(rowSeparator(colSizes), rows)
  }

  def formatRows(rowSeparator: String, rows: Seq[String]): String = (
    rowSeparator :: 
    rows.head :: 
    rowSeparator :: 
    rows.tail.toList ::: 
    rowSeparator :: 
    List()).mkString("\n")

  def formatRow(row: Seq[Any], colSizes: Seq[Int]) = {
    val cells = (for ((item, size) <- row.zip(colSizes)) yield if (size == 0) "" else ("%" + size + "s").format(item))
    cells.mkString("|", "|", "|")
  }

  def rowSeparator(colSizes: Seq[Int]) = colSizes map { "-" * _ } mkString("+", "+", "+")
}

scala> Tabulator.format(List(List("head1", "head2", "head3"), List("one", "two", "three"), List("four", "five", "six")))
res1: java.lang.String = 
+-----+-----+-----+
|head1|head2|head3|
+-----+-----+-----+
|  one|  two|three|
| four| five|  six|
+-----+-----+-----+

If you want it somewhat more compact. Bonus: left aligned and padded with 1 char on both sides. Based on the answer by Duncan McGregor (https://stackoverflow.com/a/7542476/8547501):

def formatTable(table: Seq[Seq[Any]]): String = {
  if (table.isEmpty) ""
  else {
    // Get column widths based on the maximum cell width in each column (+2 for a one character padding on each side)
    val colWidths = table.transpose.map(_.map(cell => if (cell == null) 0 else cell.toString.length).max + 2)
    // Format each row
    val rows = table.map(_.zip(colWidths).map { case (item, size) => (" %-" + (size - 1) + "s").format(item) }
      .mkString("|", "|", "|"))
    // Formatted separator row, used to separate the header and draw table borders
    val separator = colWidths.map("-" * _).mkString("+", "+", "+")
    // Put the table together and return
    (separator +: rows.head +: separator +: rows.tail :+ separator).mkString("\n")
  }
}

scala> formatTable(Seq(Seq("head1", "head2", "head3"), Seq("one", "two", "three"), Seq("four", "five", "six")))
res0: String =
+-------+-------+-------+
| head1 | head2 | head3 |
+-------+-------+-------+
| one   | two   | three |
| four  | five  | six   |
+-------+-------+-------+