Investigate event, baseflow, rising limb filter algorithms #227

Open
mgdenno opened this issue Aug 20, 2024 · 1 comment

mgdenno commented Aug 20, 2024

Several of the proposed signature metrics we have identified as important require filtering the timeseries by identifying a pattern in it. That takes a holistic view of the timeseries, not just a check of a single value against some criterion (e.g., a threshold). We need to understand what is currently being done in this area (event, baseflow, rising limb) as well as what may be required in the future, so that we can generalize this process to handle both current and future needs. That should be the aim anyway; in reality it can be hard to anticipate future needs.
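To make the distinction concrete, here is a minimal pandas sketch contrasting the two kinds of filter. The function names and the sample flow values are made up for illustration, and the rising limb rule (flow increased since the previous step) is a deliberately simple assumption, not the algorithm any existing tool uses.

```python
import pandas as pd


def above_threshold(flow: pd.Series, threshold: float) -> pd.Series:
    """Pointwise filter: each value is judged on its own."""
    return flow > threshold


def on_rising_limb(flow: pd.Series) -> pd.Series:
    """Pattern filter: keep a value only if flow increased since the previous step.

    Assumes the series is already ordered in time, because diff() compares
    each value with its predecessor. The first value has no predecessor
    and is therefore flagged False.
    """
    return flow.diff() > 0


# Made-up flow values for one location, ordered in time.
flow = pd.Series([1.0, 1.2, 3.5, 6.0, 4.0, 2.5, 2.6])

print(above_threshold(flow, 3.0).tolist())
print(on_rising_limb(flow).tolist())
```

The threshold filter could be evaluated row by row in any order, while the rising limb filter only makes sense on a complete, ordered timeseries. That ordering requirement is what a generalized approach would need to handle.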

mgdenno commented Sep 13, 2024

I think we should investigate the "Series to Series" pandas_udf() for this. See: https://spark.apache.org/docs/3.4.2/api/python/reference/pyspark.sql/api/pyspark.sql.functions.pandas_udf.html

If this works the way I think it does, it could be an easy way to add event detection with existing Python code (from HydroTools). We could then later see if doing it all natively in PySpark would be more performant.
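A rough sketch of what that could look like, under several assumptions: `rising_limb_flags` is a hypothetical stand-in for the existing Python event detection code (it is not a HydroTools function), and the column names in the usage comment are made up. One thing to check while investigating: Spark calls a Series to Series UDF once per internal Arrow batch, not once per timeseries, so a batch is not guaranteed to hold a whole, time-ordered series for one location.

```python
import pandas as pd


def rising_limb_flags(flow: pd.Series) -> pd.Series:
    """Hypothetical stand-in for existing pandas-based detection code.

    Returns True where flow increased relative to the previous value.
    """
    return flow.diff() > 0


def make_rising_limb_udf():
    """Wrap the pandas function as a Series to Series pandas UDF.

    pyspark is imported lazily so the pandas logic above can be
    tested without a Spark installation.
    """
    from pyspark.sql.functions import pandas_udf

    # The pd.Series -> pd.Series type hints mark this as Series to Series.
    return pandas_udf(rising_limb_flags, returnType="boolean")


# Usage sketch (column names are assumptions, not an existing schema):
#
#   from pyspark.sql import functions as F
#   rising_udf = make_rising_limb_udf()
#   flagged = df.withColumn("rising_limb", rising_udf(F.col("value")))
```

If the batch boundaries turn out to be a problem, grouping by location and using `applyInPandas` on each group may be worth comparing, since it hands the function all rows of a group at once.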
