Error when reading json array file? #1071

Open
SimunKaracic opened this issue Feb 22, 2024 · 5 comments
Labels
💎 Bounty, bug, help wanted

Comments

@SimunKaracic

Specifically, this file https://github.com/statsbomb/open-data/blob/master/data/competitions.json
The file is formatted as a json array, and I would like to read the file in a streaming fashion.

When opening the file with:

      json.readJsonAs(path)
        .tap(foo => ZIO.logInfo(foo.asArray.toString))
        .runCount

The entire file is read into a single item, a JSON list (instead of a stream of the individual items in the list).
It also throws this error, but seems to recover from it:

22:58:34.586 [zio-default-blocking-2] DEBUG zio.json.JsonDecoderPlatformSpecific -- timestamp=2024-02-22T22:58:34.583386+01:00 level=DEBUG thread=zio-fiber-7 message="Fiber zio-fiber-7 did not handle an error" cause=
zio.json.internal.UnexpectedEnd: if you see this a dev made a mistake using OneCharReader
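
To make the difference concrete, here is a minimal in-memory sketch of the two behaviours being compared: decoding the whole array in one step versus emitting one element per array item. `Item`, `ArrayDecodingSketch`, and the sample input are hypothetical stand-ins (not part of zio-json or of the reported code), and the Scala 3 import syntax is an assumption.

```scala
import zio.*
import zio.json.*
import zio.stream.*

// Hypothetical stand-in for Competition, reduced to a single field.
final case class Item(id: Int)

object Item {
  implicit val decoder: JsonDecoder[Item] = DeriveJsonDecoder.gen[Item]
}

object ArrayDecodingSketch {
  // A small JSON array standing in for competitions.json (assumed sample data).
  val input: String = """[{"id":1},{"id":2},{"id":3}]"""

  // Non-streaming: the whole array is parsed and decoded in one step.
  val whole: Either[String, List[Item]] = input.fromJson[List[Item]]

  // Streaming: with the Array delimiter, the pipeline is expected to emit
  // one Item per array element instead of a single value for the whole array.
  val streamed: ZStream[Any, Throwable, Item] =
    ZStream
      .fromChunk(Chunk.fromArray(input.toCharArray))
      .via(JsonDecoder[Item].decodeJsonPipeline(JsonStreamDelimiter.Array))
}
```

Running `ArrayDecodingSketch.streamed.runCount` should count the array elements rather than the array itself, which is the behaviour asked for here.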

When I try the following instead, to read the file as a Stream[Competition]:

      ZStream
        .fromPath(path.toPath)
        .via(
          ZPipeline.utf8Decode >>>
            stringToChars >>>
            JsonDecoder[Competition].decodeJsonPipeline(JsonStreamDelimiter.Array)
        )
        .runCount

I get a StackOverflowError:

23:00:26.273 [ZScheduler-Worker-9] DEBUG foo.bar.Main.run -- timestamp=2024-02-22T23:00:26.271057+01:00 level=DEBUG thread=zio-fiber-5 message="Fiber zio-fiber-5 did not handle an error" cause=
java.lang.StackOverflowError: null

The stack points directly to the derived Competition class json codec.

Class and codec:

    case class Competition(
        competition_id: Option[Int],
        season_id: Option[Int],
        country_name: Option[String],
        competition_name: Option[String],
        competition_gender: Option[String],
        competition_youth: Option[Boolean],
        competition_international: Option[Boolean],
        season_name: Option[String],
        match_updated: Option[String],
        match_available: Option[String]
    )

    object Competition {
      implicit val decoder: JsonDecoder[Competition] = DeriveJsonDecoder.gen[Competition]
    }

ZIO-json version:
0.6.2

@SimunKaracic
Author

OK, so this was a weird one. I was running all the code inside a single Main.scala file.
The StackOverflowError disappears if I define the Competition class and decoder outside of the main object.
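
A sketch of the layout described above, with the class and its decoder at the top level of their own file rather than nested inside the main object. The file name `Competition.scala` is hypothetical, and the cause of the original StackOverflowError is not established in this thread, so treat this only as the arrangement that made it go away.

```scala
// Competition.scala (hypothetical file name): top level, not nested in Main.
import zio.json.*

final case class Competition(
    competition_id: Option[Int],
    season_id: Option[Int],
    country_name: Option[String],
    competition_name: Option[String],
    competition_gender: Option[String],
    competition_youth: Option[Boolean],
    competition_international: Option[Boolean],
    season_name: Option[String],
    match_updated: Option[String],
    match_available: Option[String]
)

object Competition {
  // Derived decoder lives in the companion, so JsonDecoder[Competition]
  // resolves from implicit scope wherever Competition is used.
  implicit val decoder: JsonDecoder[Competition] = DeriveJsonDecoder.gen[Competition]
}
```

With this in place, the main object only needs to reference `Competition`; for example, `someJsonString.fromJson[Competition]` picks up the decoder from the companion.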

Only this remains, but the result seems to be fine:

20:10:09.146 [zio-default-blocking-2] DEBUG zio.json.JsonDecoderPlatformSpecific -- timestamp=2024-02-23T20:10:09.144538+01:00 level=DEBUG thread=zio-fiber-7 message="Fiber zio-fiber-7 did not handle an error" cause=
zio.json.internal.UnexpectedEnd: if you see this a dev made a mistake using OneCharReader

@fsvehla added the bug and help wanted labels on Mar 18, 2024
@jdegoes
Member

jdegoes commented Apr 25, 2024

/bounty $100

algora-pbc bot commented Apr 25, 2024

💎 $100 bounty • ZIO

Steps to solve:

  1. Start working: Comment /attempt #1071 with your implementation plan
  2. Submit work: Create a pull request including /claim #1071 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to zio/zio-json!

@Andrapyre

@SimunKaracic, I cannot reproduce this error, either in tests or in the main class as you mentioned. Could you share the whole Main.scala file, along with the relevant environment details: your platform, Scala version, and ZIO version? (Thanks for mentioning the zio-json version!)

@SimunKaracic
Author

I am also not able to reproduce the bug anymore, as I threw away the original exploratory code.
I guess it would still be nice if we had something like this in zio-json, to support loading JSON arrays from files:

  def readJsonArrayAs[T: JsonDecoder](path: Path): ZStream[Any, Throwable, T] = {
    ZStream
      .fromPath(path)
      .via(
        ZPipeline.utf8Decode >>>
          stringToChars >>>
          JsonDecoder[T].decodeJsonPipeline(JsonStreamDelimiter.Array)
      )
  }

My attempt at reproducing the bug (I also tried lowering the Scala version, but it didn't help):

build.sbt:

ThisBuild / version := "0.1.0-SNAPSHOT"

ThisBuild / scalaVersion := "3.4.2"

lazy val root = (project in file("."))
  .settings(
    name := "zio-json-reproduce",
    libraryDependencies ++= Seq(
      "dev.zio" %% "zio" % "2.1.5",
      "dev.zio" %% "zio-json" % "0.6.2"
    )
  )

Main.scala:

import zio.*
import zio.json.*
import zio.stream.*

import java.nio.file.{Path, Paths}

object Main extends ZIOAppDefault {

  case class Competition(
                          competition_id: Int,
                          season_id: Int,
                          country_name: String,
                          competition_name: String,
                          competition_gender: String,
                          competition_youth: Boolean,
                          competition_international: Boolean,
                          season_name: String,
                          match_updated: Option[String],
                          match_available: String
                        )

  object Competition {
    implicit val decoder: JsonDecoder[Competition] = DeriveJsonDecoder.gen[Competition]
  }

  private def stringToChars: ZPipeline[Any, Nothing, String, Char] =
    ZPipeline.mapChunks[String, Char](_.flatMap(_.toCharArray))

  val path = "competitions.json"
  val loadsWholeFileIntoArray: ZIO[Any, Throwable, Long] = json.readJsonAs(path)
    .runCount

  def readJsonArrayAs[T: JsonDecoder](path: Path): ZStream[Any, Throwable, T] = {
    ZStream
      .fromPath(path)
      .via(
        ZPipeline.utf8Decode >>>
          stringToChars >>>
          JsonDecoder[T].decodeJsonPipeline(JsonStreamDelimiter.Array)
      )
  }
  
  val iteratesThroughArrayOneByOne = readJsonArrayAs[Competition](Paths.get(path)).runCount

  override def run: ZIO[Any with ZIOAppArgs with Scope, Any, Any] = {
    loadsWholeFileIntoArray.flatMap { c =>
      ZIO.logInfo(s"Array count: ${c}")
    } *> iteratesThroughArrayOneByOne.flatMap { c => ZIO.logInfo(s"Items inside array count: ${c}") }
  }
}
