Topics in headers #3008

Closed
patmmccann opened this issue Aug 7, 2023 · 20 comments
@patmmccann

patmmccann commented Aug 7, 2023

Prebid Server should read the Chrome-provided Sec-Browsing-Topics header and convert it to OpenRTB.

Requirements

  1. Prebid Server must allow the host company to define a domain to be used as the name for topics segments from the header. e.g. a new auction.privacysandbox.topicsdomain configuration.
  2. Prebid Server must look for and parse the Sec-Browsing-Topics header
  3. Validate the fields in the header. See below for the format details. A header may contain 0-10 fields. (e.g. (SEG);v=chrome.1:2:3)
    1. if taxonomy_version is not an integer, or is not in the range of 1-10, error
    2. no validation on model_version
    3. segments must be integers greater than 0 and less than MAXINT; otherwise, error
    4. note that it's ok for different fields to have different taxonomy_version or model_version.
    5. for a given field to be considered valid, it must have at least one segment, a valid taxonomy_version, and a model_version.
    6. If there are more than 10 fields in the header, the 11th and later are considered invalid
    7. Any padding field is skipped and not validated. A padding field is defined as one that starts with "();p="
    8. If any validation fails, that header field should be ignored with a warning when in debug mode
  4. After validation passes, create an ORTB user.data[] segment from each field according to these rules
    1. set the "name" to "TOPICS_DOMAIN". If TOPICS_DOMAIN is not specified, name will be empty.
    2. segtax is 600+(taxonomy_version-1) where taxonomy_version must be between 1 and 10 inclusive.
    3. segclass is the model_version
    4. the segment array comes from the segment list in the header field
  5. PBS must search for an existing user.data[] entry where name="TOPICS_DOMAIN" AND segtax matches what was determined above AND segclass= model_version from the header.
    1. If an existing entry was found that matches all 3 aspects, merge its segment list with the header segments
    2. Else create a new user.data entry
  6. When the Sec-Browsing-Topics header was seen on the request, Prebid Server should always set the Observe-Browsing-Topics response header to a value of ?1, even when the request ended in an error.
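
Requirements 2 and 3 can be sketched roughly as follows. This is an illustrative Python sketch, not the PBS-Go or PBS-Java implementation. The function name `parse_topics_header`, the 32-bit `MAX_INT`, and the choice to count padding fields toward the 10-field limit are all assumptions that the requirements do not settle.

```python
import re

MAX_FIELDS = 10
MAX_INT = 2**31 - 1  # assumption: the issue says "MAXINT" without giving a width

# A non-padding field looks like "(1 2);v=chrome.1:1:2".
FIELD_RE = re.compile(r"\(([^)]*)\);v=chrome\.[^:]+:([^:]+):([^:]+)")
UINT_RE = re.compile(r"[0-9]+")


def parse_topics_header(value):
    """Return (valid_fields, warnings) for a Sec-Browsing-Topics header value."""
    fields, warnings = [], []
    raw_fields = [part.strip() for part in value.split(",") if part.strip()]
    for position, raw in enumerate(raw_fields, start=1):
        if raw.startswith("();p="):
            continue  # padding field: skipped and not validated
        if position > MAX_FIELDS:
            warnings.append(f"field {position} is past the {MAX_FIELDS}-field limit")
            continue
        match = FIELD_RE.fullmatch(raw)
        if match is None:
            warnings.append(f"field {position} is malformed: {raw!r}")
            continue
        segments_text, taxonomy_text, model_version = match.groups()
        if not UINT_RE.fullmatch(taxonomy_text) or not 1 <= int(taxonomy_text) <= 10:
            warnings.append(f"field {position} has an invalid taxonomy_version")
            continue
        segments = segments_text.split()
        valid = segments and all(
            UINT_RE.fullmatch(s) and 0 < int(s) < MAX_INT for s in segments
        )
        if not valid:
            warnings.append(f"field {position} has missing or invalid segments")
            continue
        fields.append({
            "segments": [int(s) for s in segments],
            "taxonomy_version": int(taxonomy_text),
            "model_version": model_version,  # deliberately not validated
        })
    return fields, warnings
```

An invalid field only produces a warning and is dropped; the remaining fields are still returned, which matches the "ignore that header field" wording in requirement 3.8.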

Sec-Browsing-Topics Header Format

A couple of examples from https://patcg-individual-drafts.github.io/topics/ suggested a set of assumptions that has since been confirmed by Google:

Example 9 sets Sec-Browsing-Topics to the value "(100);v=chrome.1:1:20, (200);v=chrome.1:1:40, (300);v=chrome.1:1:60, ();p=P"

Three returned topics, and underlying epochs have three different versions:
(100);v=chrome.1:1:20, (200);v=chrome.1:1:40, (300);v=chrome.1:1:60, ();p=P

This is 4 sets of data:

  • (100);v=chrome.1:1:20 - segment 100 in taxonomy version 1, model version 20
  • (200);v=chrome.1:1:40 - segment 200 in taxonomy version 1, model version 40
  • (300);v=chrome.1:1:60 - segment 300 in taxonomy version 1, model version 60
  • ();p=P - padding, ignore

Another example is relevant:

Two returned topics, and underlying epochs have same versions:
(1 2);v=chrome.1:1:2, ();p=P000000000

This is 2 sets of data:

  • (1 2);v=chrome.1:1:2 - segments 1 and 2 in taxonomy version 1, model version 2
  • ();p=P000000000 - padding, ignore

The parsing of the version string (e.g. v=chrome.1:1:2) is defined by this statement in the document:

the version is the result of concatenating « configurationVersion, taxonomyVersion, modelVersion » using ":"

In other words, the general format is v=chrome.D:TAXONOMY_VERSION:MODEL_VERSION.

  • We ignore the chrome.D part
  • TAXONOMY_VERSION-1+600 is the segtax
  • MODEL_VERSION is the segclass
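
The version-string rule above can be illustrated with a small Python sketch. The function name `segtax_and_segclass` is hypothetical, and the input is assumed to be the text after "v=".

```python
def segtax_and_segclass(version):
    """Map a version such as 'chrome.1:1:2206021246' to (segtax, segclass)."""
    parts = version.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected CONFIG:TAXONOMY:MODEL, got {version!r}")
    _config_version, taxonomy_text, model_version = parts  # chrome.D is ignored
    taxonomy_version = int(taxonomy_text)  # raises ValueError for non-integers
    if not 1 <= taxonomy_version <= 10:
        raise ValueError(f"taxonomy_version out of range: {taxonomy_version}")
    if not model_version:
        raise ValueError("model_version is missing")
    return 600 + (taxonomy_version - 1), model_version
```

For example, `segtax_and_segclass("chrome.1:2:77777")` yields segtax 601 and segclass "77777".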

Translating this to ORTB:

{
    "user": {
        "data": [{
            "name": "TOPICS_DOMAIN",
            "ext": {
                 "segtax": 600+TAXONOMY_VERSION-1,
                 "segclass": "MODEL_VERSION"
            },
            "segment": [
                { "id": "SEGMENT 1" },
                { "id": "SEGMENT 2" }
            ]
        }]
    }
}
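
As a rough illustration of this template, the following Python sketch builds one user.data entry from already-validated values. The helper name `build_user_data_entry` and its parameters are hypothetical; the empty-name fallback follows requirement 4.1.

```python
import json


def build_user_data_entry(topics_domain, taxonomy_version, model_version, segments):
    """Build one ORTB user.data[] entry from a validated header field."""
    return {
        "name": topics_domain or "",  # empty when TOPICS_DOMAIN is not configured
        "ext": {
            "segtax": 600 + (taxonomy_version - 1),
            "segclass": str(model_version),
        },
        "segment": [{"id": str(segment)} for segment in segments],
    }


entry = build_user_data_entry("TOPICS_DOMAIN", 1, "2206021246", [186, 265])
print(json.dumps({"user": {"data": [entry]}}, indent=4))
```

Segment IDs and segclass are emitted as strings, matching the examples in this issue.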

Examples

The following examples are based on the following HTTP header

Sec-Browsing-Topics: (186);v=chrome.1:1:2206021246, (265);v=chrome.1:1:2206021246, ();p=P000000000

Example 1 - no incoming user.data

Create a user.data field that looks like:

{
    "user": {
        "data": [{
            "name": "TOPICS_DOMAIN",
            "ext": {
                 "segtax": 600,
                 "segclass": "2206021246"
            },
            "segment": [
                { "id": "186" },
                { "id": "265" }
            ]
        }]
    }
}

Example 2 - non-conflicting user.data

Create a user.data field that looks like:

{
    "user": {
        "data": [{
            "name": "TOPICS_DOMAIN",    // this came in on the request. leave it alone
            "ext": {
                 "segtax": 4
            },
            "segment": [
                { "id": "111" },
                { "id": "222" }
            ]
        },{
            "name": "TOPICS_DOMAIN",      // this is the new PBS-generated entry
            "ext": {
                 "segtax": 600,
                 "segclass": "2206021246"
            },
            "segment": [
                { "id": "186" },
                { "id": "265" }
            ]
        }]
    }
}

Example 3 - overlapping segments

The incoming ORTB contains:

{
    "user": {
        "data": [{
            "name": "TOPICS_DOMAIN",        // overlap, so segments merged below
            "ext": {
                 "segtax": 600,
                 "segclass": "2206021246"
            },
            "segment": [
                { "id": "111" },
                { "id": "265" }
            ]
        }]
    }
}

The result will be:

{
    "user": {
        "data": [{
            "name": "TOPICS_DOMAIN",
            "ext": {
                 "segtax": 600,
                 "segclass": "2206021246"
            },
            "segment": [
                { "id": "111" },
                { "id": "265" },      // segment came in on both the ORTB and the header, not duplicated
                { "id": "186" }       // unique topic from header appended
            ]
        }]
    }
}
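
The search-and-merge step (requirement 5) behind Examples 1 to 3 could look roughly like this in Python. It is a sketch under stated assumptions: `merge_topics` is a hypothetical name, `user_data` is the parsed user.data list, and segment IDs are compared as strings.

```python
def merge_topics(user_data, topics_domain, segtax, segclass, header_segment_ids):
    """Merge header segments into a matching user.data entry, or append a new one."""
    header_ids = list(dict.fromkeys(header_segment_ids))  # de-duplicate, keep order
    for entry in user_data:
        ext = entry.get("ext") or {}
        if (
            entry.get("name") == topics_domain
            and ext.get("segtax") == segtax
            and ext.get("segclass") == segclass
        ):
            segments = entry.setdefault("segment", [])
            existing = {segment.get("id") for segment in segments}
            # Append only the header segments that are not already present.
            segments.extend({"id": i} for i in header_ids if i not in existing)
            return user_data
    # No entry matched all three aspects: create a new one.
    user_data.append({
        "name": topics_domain,
        "ext": {"segtax": segtax, "segclass": segclass},
        "segment": [{"id": i} for i in header_ids],
    })
    return user_data
```

With the Example 3 input and header segments "186" and "265", the existing entry ends up with segments 111, 265, 186. With the Example 2 input, the segtax 4 entry does not match, so a second entry is appended.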

Example 4 - multiple taxonomies

This example is based on the following HTTP header. Note that the taxonomy of the 2nd field is "2" instead of "1" as in the above examples.

Sec-Browsing-Topics: (186);v=chrome.1:1:2206021246, (265);v=chrome.1:2:77777, ();p=P000000000

Create a user.data field that looks like:

{
    "user": {
        "data": [{
            "name": "TOPICS_DOMAIN",
            "ext": {
                 "segtax": 600,
                 "segclass": "2206021246"
            },
            "segment": [
                { "id": "186" }
            ]
        },{
            "name": "TOPICS_DOMAIN",
            "ext": {
                 "segtax": 601,
                 "segclass": "77777"
            },
            "segment": [
                { "id": "265" }
            ]
        }]
    }
}

---- Original text ----

Prebid.js will soon be providing topics in request headers using this method

https://developer.chrome.com/en/docs/privacy-sandbox/topics/#use-headers-to-access-and-observe-topics

Prebid Server should copy them into the OpenRTB request as described in https://github.com/google/ads-privacy/blob/master/proposals/topics-rtb/README.md

It looks like this:

Sec-Browsing-Topics: 186;version="chrome.1:1:2206021246";config_version="chrome.1";model_version="2206021246";taxonomy_version="1", 265;version="chrome.1:1:2206021246";config_version="chrome.1";model_version="2206021246";taxonomy_version="1"

From the Topics dev guide: "To have the topics in the Sec-Browsing-Topics request header marked by the browser as observed, but also to include the current page visit in the user's next epoch top topic calculation, the server's response has to include Observe-Browsing-Topics: ?1. Here's a JavaScript example using setHeader(): res.setHeader('Observe-Browsing-Topics', '?1');"
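
A framework-neutral Python sketch of the same rule follows. The function name `topics_response_headers` is hypothetical, and request headers are assumed to arrive as a plain name-to-value mapping.

```python
def topics_response_headers(request_headers):
    """Return the extra response headers required by the Topics API."""
    # HTTP header names are case-insensitive, so compare in lower case.
    seen = any(name.lower() == "sec-browsing-topics" for name in request_headers)
    # The header is echoed whenever topics were seen, even if the auction failed.
    return {"Observe-Browsing-Topics": "?1"} if seen else {}
```

The caller would add the returned headers to every response, including error responses.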

@bretg
Contributor

bretg commented Aug 7, 2023

Thanks @patmmccann . Questions:

  1. What segtax should Prebid use - always 600?
  2. Can we overwrite whatever else is in the user.data.segtax=600 structure, or should we merge with whatever already might be there?
  3. I'm given to understand the values that come in on the header might be different depending on the domain of the Prebid Server. e.g. the browser might give Magnite's topics when sending to *.rubiconproject.com. Are these topics appropriate to merge globally for all bidders, or are they only appropriate for an adapter actually owned by the Prebid Server host?

@bretg
Contributor

bretg commented Aug 8, 2023

Discussed with Patrick.

Requirements:

  1. segtax is 600+(taxonomy_version-1) where taxonomy_version must be between 1 and 10 inclusive.
  2. if taxonomy_version is not an integer, or is not 1-10, the header should be ignored with a warning when in debug mode
  3. PBS must support a new configuration option "host company segment domain".
  4. PBS must search for an existing entry where name="host company segment domain" AND segtax matches what was determined in step 1 AND segclass= model_version from the header AND method=2.
    1. If an entry was found that matches all 4 aspects, merge its segment list with the topics in the header (the first field)
    2. Else create a new user.data entry containing that information

The following examples assume the header above.

Example 1 - no incoming user.data

Create a user.data field that looks like:

{
    "user": {
        "data": [{
            "name": "hostcompanysegdomain.com",
            "ext": {
                 "segtax": 600,
                 "segclass": "2206021246"
            },
            "segment": [
                { "id": "186" },
                { "id": "265" }
            ]
        }]
    }
}

Example 2 - non-conflicting user.data

Create a user.data field that looks like:

{
    "user": {
        "data": [{
            "name": "segdomain.com",    // this came in on the request. leave it alone
            "ext": {
                 "segtax": 4
            },
            "segment": [
                { "id": "111" },
                { "id": "222" }
            ]
        },{
            "name": "hostcompanysegdomain.com",      // this is the new PBS-generated entry
            "ext": {
                 "segtax": 600,
                 "segclass": "2206021246"
            },
            "segment": [
                { "id": "186" },
                { "id": "265" }
            ]
        }]
    }
}

Example 3 - completely overlapping entry

The incoming ORTB contains:

{
    "user": {
        "data": [{
            "name": "hostcompanysegdomain.com",        // overlap, so segments merged below
            "ext": {
                 "segtax": 600,
                 "segclass": "2206021246"
            },
            "segment": [
                { "id": "111" },
                { "id": "265" }
            ]
        }]
    }
}

The result will be

{
    "user": {
        "data": [{
            "name": "hostcompanysegdomain.com",
            "ext": {
                 "segtax": 600,
                 "segclass": "2206021246"
            },
            "segment": [
                { "id": "111" },
                { "id": "265" },      // segment came in on both the ORTB and the header, not duplicated
                { "id": "186" }       // unique topic from header appended
            ]
        }]
    }
}

Over to @patmmccann to verify the reqs and examples

@patmmccann
Author

It took me a moment to determine how example 3 and 4 were different but I got there, all lgtm

@patmmccann
Author

as reference prebid/Prebid.js#10340 adds the header to the pbjs request to pbs on chrome browsers

@bretg
Contributor

bretg commented Aug 11, 2023

Updated based on committee discussion to drop method. If the IAB/google end up approving method, we can add it later.

@bretg
Contributor

bretg commented Oct 31, 2023

The format for the topics header has changed. Am working through https://patcg-individual-drafts.github.io/topics/#the-sec-browsing-topics-http-request-header-header to get the updated structure.

Note that https://groups.google.com/a/chromium.org/g/topics-api-announce/c/7I-cKaupgLI?pli=1 is where the new format is announced.

@bretg
Contributor

bretg commented Oct 31, 2023

Not finding a clear guide to parsing the header, but a couple of examples from https://patcg-individual-drafts.github.io/topics/ suggest a set of assumptions.

Example 9 sets Sec-Browsing-Topics to the value "(100);v=chrome.1:1:20, (200);v=chrome.1:1:40, (300);v=chrome.1:1:60, ();p=P"

Three returned topics, and underlying epochs have three different versions:
(100);v=chrome.1:1:20, (200);v=chrome.1:1:40, (300);v=chrome.1:1:60, ();p=P

This is 4 sets of data:

  • (100);v=chrome.1:1:20 - segment 100 in taxonomy version 1, model version 20
  • (200);v=chrome.1:1:40 - segment 200 in taxonomy version 1, model version 40
  • (300);v=chrome.1:1:60 - segment 300 in taxonomy version 1, model version 60
  • ();p=P - padding, ignore

Another example is relevant:

Two returned topics, and underlying epochs have same versions:
(1 2);v=chrome.1:1:2, ();p=P000000000

This is 2 sets of data:

  • (1 2);v=chrome.1:1:2 - segments 1 and 2 in taxonomy version 1, model version 2
  • ();p=P000000000 - padding, ignore

The parsing of v=chrome.1:1:2 is defined by this:

the version is the result of concatenating « configurationVersion, taxonomyVersion, modelVersion » using ":".

So the general format is v=chrome.D:TAXONOMY_VERSION:MODEL_VERSION.

Translating this to ORTB is as above.

{
    "user": {
        "data": [{
            "name": "hostcompanysegdomain.com",
            "ext": {
                 "segtax": 600+TAXONOMY_VERSION-1,
                 "segclass": "MODEL_VERSION"
            },
            "segment": [
                { "id": "SEGMENT 1" },
                { "id": "SEGMENT 2" }
            ]
        }]
    }
}

@jdwieland8282
Member

user.data.name should be chrome.com.

So
Sec-Browsing-Topics: (1 2);v=chrome.1:1:2, ();p=P000000000

should look like this when converted to oRTB. right?

{
    "user": {
        "data": [{
            "name": "chrome.com",
            "ext": {
                 "segtax": 600,
                 "segclass": "2"
            },
            "segment": [
                { "id": "1" },
                { "id": "2" }
            ]
        }]
    }
}

@bretg
Contributor

bretg commented Nov 1, 2023

Updated the description with the updated understanding of how the header is structured.

The outstanding issue is to resolve whether the name field is "chrome.com" or is the host company's segment domain.

@patmmccann
Author

patmmccann commented Nov 1, 2023

user.data.name should not be chrome.com; it should be the entity with the topics network, matching https://github.com/google/ads-privacy/tree/master/proposals/topics-rtb

chrome.com is useless information as it is already implied by the segtax

@pm-harshad-mane

Thank you, Bret, for revising the description with the updated format specifics. Considering that the Chrome browser shares TOPICS across domains based on their presence on other websites, and segtax 600 explains that these segments are Chrome-provided TOPICS, I believe it would be more appropriate to specify the segment domain of the hosting company rather than using "chrome.com."

@jdwieland8282
Member

I disagree, I'll grant that segTax 600 is associated with Chrome and is knowable by a downstream buyer, but that wouldn't hold true for segTax 1-7 which are the standard taxonomies. When a domain uses one of the standard taxonomies the name is used to describe who constructed the segment, a buyer can't know that Acme DMP composed segment 123 using segtax 7 unless it defines itself in the user.data.name.

@rdgordon-index

I'm curious how we would ever have overlapping data -- Chrome only ships their topics data in the header, so it should never conflict.

@pm-harshad-mane

@rdgordon-index , In a Hybrid configuration featuring the Topics module enabled on the client-side, the Topics module will transmit specific Topics as segments within the PBS request body, while Chrome will also include some Topics in the headers. In such scenario, if the Topics module enables a bidder to offer topics gathered through an iframe/fetch endpoint and the PBS is hosted on the same bidder domain then we can have overlapping

@bretg
Contributor

bretg commented Nov 2, 2023

Updated to have the TOPICS_DOMAIN be a host-company configurable setting.

@rdgordon-index

rdgordon-index commented Nov 8, 2023

In such scenario, if the Topics module enables a bidder to offer topics gathered through an iframe/fetch endpoint and the PBS is hosted on the same bidder domain then we can have overlapping

And you see this scenario as possible for the Chrome-provided, header-based Topics API taxonomies?

@patmmccann
Author

In such scenario, if the Topics module enables a bidder to offer topics gathered through an iframe/fetch endpoint and the PBS is hosted on the same bidder domain then we can have overlapping

And you see this scenario as possible for the Chrome-provided, header-based Topics API taxonomies?

I see it as typical

@bretg
Contributor

bretg commented Nov 14, 2023

Updated proposed config from auction.topics-domain to auction.privacysandbox.topicsdomain

@bretg bretg assigned bsardo and unassigned SyntaxNode Dec 20, 2023
pm-nilesh-chate added a commit to pm-nilesh-chate/prebid-server that referenced this issue Jan 10, 2024
@pm-nilesh-chate
Contributor

Draft PR: #3393

pm-nilesh-chate added a commit to pm-nilesh-chate/prebid-server that referenced this issue Jan 10, 2024
@SyntaxNode
Contributor

Implemented in PBS-Go v2.14.0. Thank you @pm-nilesh-chate for the contribution.

Projects
Status: Done

8 participants