
Module LRU Cache #3703

Open
bretg opened this issue May 22, 2024 · 1 comment

Comments


bretg commented May 22, 2024

The committee discussed providing caching services to modules in the context of #3512

We agreed that another important component would be a controlled cache local to Prebid Server. Some modules may have a local cache hit ratio high enough to help overall latency.

Requirements:

  1. Use of a local cache must be optional: something that module vendors and host companies decide.
  2. The amount of data each module is allowed to store locally must be configurable.
  3. The backing store URL used by each module must be configurable and must support a macro for the retrieval key.

Note that the 51Degrees module implemented an LRU cache that might be applicable more broadly:

https://github.com/51Degrees/Java-Device-Detection/blob/cbc9a1acf55a863054530152fe42daa9bde2ed07/device-detection-core/src/main/java/fiftyone/mobile/detection/cache/LruCache.java#L67
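For reference, the core of such an LRU cache can be sketched with the JDK alone: `LinkedHashMap` in access-order mode evicts the least recently used entry once a cap is hit. This is a minimal illustration of the eviction principle, not a proposal for the final implementation (the 51Degrees cache linked above is considerably more elaborate).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of a size-bounded LRU cache on top of the JDK's
// LinkedHashMap in access-order mode.
class SimpleLruCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    SimpleLruCache(int maxEntries) {
        // accessOrder = true: iteration order tracks recency of access
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // evict the least recently used entry once the cap is exceeded
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting `a` and `b`, touching `a`, then putting `c` evicts `b` rather than `a`.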

Looking for community members to help flesh out a technical proposal.


muuki88 commented May 25, 2024

Hi,

Sorry I couldn't make it to the last PMC; I had promised to give some feedback on the caching matter.

a controlled cache local to Prebid Server.

I assume that "local" means the same network/data center and additionally, but optionally, in-memory. And as this is very heavy on implementation details, I would argue that it's okay if the Java and Go versions have no feature parity.

Prebid Server Java

I can only speak from the prebid-server-java perspective. However, I assume that there are equally suitable alternatives in Go.

In Java there's already https://github.com/ben-manes/caffeine available as an in-memory caching implementation. It's being used in the VendorListService to cache vendor lists to some extent.

https://github.com/prebid/prebid-server-java/blob/04bfffe117b053b5f624eab7cd72a85576e7e77e/src/main/java/org/prebid/server/privacy/gdpr/vendorlist/VendorListService.java#L238-L240
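For illustration, a bounded, expiring Caffeine cache of the kind a module could be handed looks like the sketch below; the size and retention values are made up, mirroring the knobs a module config might expose.

```java
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

class CaffeineModuleCacheSketch {
    public static void main(String[] args) {
        // Bounded, expiring in-memory cache; the limits are illustrative.
        Cache<String, String> cache = Caffeine.newBuilder()
                .maximumSize(10_000)                 // entry cap
                .expireAfterWrite(1, TimeUnit.HOURS) // retention
                .build();

        cache.put("vendorlist:v2", "{\"vendorListVersion\": 2}");
        // getIfPresent returns null on a miss
        System.out.println(cache.getIfPresent("vendorlist:v2"));
    }
}
```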

There are some hand-written cache systems that might benefit from it as well, e.g.

https://github.com/prebid/prebid-server-java/blob/04bfffe117b053b5f624eab7cd72a85576e7e77e/src/main/java/org/prebid/server/settings/SettingsCache.java#L22-L23

I would argue that we should

  • leverage the existing in-memory library (Caffeine), which has a very extensive feature set
  • expose some of its API, such as retention time and size
  • provide different AsyncCacheLoader instances, as described in Customizable Backends ben-manes/caffeine#321 (comment), that are configurable via settings (url, name, etc.)

This caching component should be so generic it can be used in modules, but also in the rest of the code base, e.g. for the VendorList, geo lookups or stored impressions.

This is a rough sketch of how it might look, specifically for modules.

```yaml
modules:
  my-module-1:
    cache:
      storage: memory
      max-size: 10000

  my-module-2:
    # this configuration is modelled as its own data class that can be
    # instantiated from a settings property
    cache:
      storage: redis
      host: redis.mynetwork.local
      port: 8467
      # eviction strategies: https://github.com/ben-manes/caffeine/wiki/Eviction
      max-size: 100000
      retention: 3600
```

A module should be able to instantiate a cache from its configuration. Something like:

```java
var cacheConfig = CacheConfig.fromSettings(...);
var cache = new Cache(cacheConfig);
```
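A minimal sketch of such a `CacheConfig` in Java, assuming the hypothetical keys from the YAML above (`storage`, `max-size`, `retention`); none of this is an agreed API, and the defaults are made up.

```java
import java.util.Map;

// Hypothetical CacheConfig data class matching the YAML sketch above;
// the keys and default values are illustrative only.
record CacheConfig(String storage, long maxSize, long retentionSeconds) {

    static CacheConfig fromSettings(Map<String, String> settings) {
        return new CacheConfig(
                settings.getOrDefault("storage", "memory"),
                Long.parseLong(settings.getOrDefault("max-size", "10000")),
                Long.parseLong(settings.getOrDefault("retention", "3600")));
    }
}
```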

The Cache implementation requires only two methods: put and get. How to define those is probably a matter of programming taste. A couple of options come to mind:

  1. Force every cache implementation down to the least common denominator, which is ByteBuffer or Array<Byte>, at least for the value; keys may be forced to be strings.
  2. Add a decoder/encoder typeclass to the put and get methods, which provides the necessary logic to encode/decode values. This is a common design pattern in functional languages (see https://circe.github.io/circe/quickstart.html# ).

Option two may look like this

```scala
// scala... sorry, I'm faster in this

// result is either an exception or nothing
type Result = Either[CacheException, Unit]

trait Encoder[T] {
  def encode(value: T): ByteBuffer // or some other low-level primitive
}

trait Decoder[T] {
  def decode(value: ByteBuffer): Either[DecodeException, T]
}

trait Cache {
  def put[Value: Encoder](key: String, value: Value): Future[Result]

  // result is empty if the key is not found
  def get[Value: Decoder](key: String): Future[Option[Value]]
}
```

The second option requires module maintainers to write Decoder and Encoder instances for the values they want to persist, which makes it trivial to unit test those.
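For completeness, the same typeclass-style design translated into Java might look like the following sketch; the interface names and the trivial in-memory implementation are illustrative only, not an agreed Prebid Server API.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

// Hedged Java translation of the Scala sketch above.
interface Encoder<T> {
    ByteBuffer encode(T value);
}

interface Decoder<T> {
    // a production design would surface decode failures, e.g. via an Either-like type
    T decode(ByteBuffer bytes);
}

interface ModuleCache {
    <V> CompletableFuture<Void> put(String key, V value, Encoder<V> encoder);
    <V> CompletableFuture<Optional<V>> get(String key, Decoder<V> decoder);
}

// Trivial in-memory implementation, just enough to unit-test codecs.
class InMemoryModuleCache implements ModuleCache {
    private final Map<String, ByteBuffer> store = new HashMap<>();

    public <V> CompletableFuture<Void> put(String key, V value, Encoder<V> encoder) {
        store.put(key, encoder.encode(value));
        return CompletableFuture.completedFuture(null);
    }

    public <V> CompletableFuture<Optional<V>> get(String key, Decoder<V> decoder) {
        // duplicate() so repeated gets don't consume the stored buffer
        return CompletableFuture.completedFuture(
                Optional.ofNullable(store.get(key)).map(buf -> decoder.decode(buf.duplicate())));
    }
}
```

An Encoder/Decoder pair for String is a one-liner over `StandardCharsets.UTF_8`, and each codec can be unit tested in isolation from the cache.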

Labels: none yet
Project status: Needs Requirements
2 participants