SPDX IDs #22

Open
TinoDidriksen opened this issue Dec 15, 2020 · 7 comments
Labels: good first issue, help wanted

Comments

@TinoDidriksen
Member

We should add SPDX IDs (https://spdx.dev/ids/) to all our files.

TinoDidriksen added the "help wanted" and "good first issue" labels on Dec 15, 2020
@mr-martian
Contributor

Are all our files pretty much the same license so that this could be automated?

  1. Look up comment char by file suffix
  2. Check for shebang
  3. Add SPDX line on first or second line depending on step 2

@TinoDidriksen
Member Author

Pretty much, but a bunch of files are currently lacking any copyright information. Almost none of the XML or other language data files have it.

@mr-martian
Contributor

Does this look good?

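import os
import sys

# Comment templates for the file types we know about
xml_comment = '<!-- %s -->\n'
hfst_comment = '! %s\n'
cpp_comment = '// %s\n'
other_comment = '# %s\n'
spdx_line = 'SPDX-License-Identifier: GPL-3.0-or-later'

path = sys.argv[1]
ext = os.path.splitext(path)[-1]
with open(path, encoding='utf-8') as f:
    lines = f.readlines()

# Do nothing if the file already carries an SPDX identifier
if any('SPDX-License-Identifier:' in line for line in lines):
    sys.exit(0)

# Make sure the last line ends with a newline so nothing gets glued to it
if lines and not lines[-1].endswith('\n'):
    lines[-1] += '\n'

if ext in ['.dix', '.lsx', '.lrx', '.t1x', '.t2x', '.t3x', '.arx']:
    # The comment has to follow the XML declaration, so add one if it is missing
    if not lines or '<?xml' not in lines[0].lower():
        lines.insert(0, '<?xml version="1.0" encoding="UTF-8"?>\n')
    lines.insert(1, xml_comment % spdx_line)
elif ext in ['.lexc', '.twol', '.twoc']:
    lines.insert(0, hfst_comment % spdx_line)
elif ext in ['.h', '.cc']:
    lines.insert(0, cpp_comment % spdx_line)
else:
    # Keep a shebang, if any, on the first line
    n = 1 if lines and lines[0].startswith('#!') else 0
    lines.insert(n, other_comment % spdx_line)

with open(path, 'w', encoding='utf-8') as f:
    f.write(''.join(lines))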

@TinoDidriksen
Member Author

Yes, but not all repos or files are GPL-3.0-or-later. Some are GPL-2.0-or-later (e.g. https://github.com/apertium/apertium) or GPL-3.0-only (e.g. https://github.com/apertium/apertium-fin). Licenses can also be mixed within a repo (e.g. autogen.sh is often GPL-2.0-or-later). Strictly speaking, some are even CC0-1.0, though we wrap that in GPL-3.0-or-later.

So it's scriptable, but not quite that trivially. The script has to examine the current file for an existing license, and otherwise fall back to the repo's COPYING file.

As a helper script that authors can run on their own repos, it'd be good.
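
The lookup order described here (existing license in the file first, then the repo's COPYING file) could be sketched roughly as follows. This is only an illustration: the function names, the 4096-byte read limit, and the text heuristics are assumptions, it only recognises plain GPL 2/3 wording, and a COPYING file alone cannot tell "-only" from "-or-later", so that result is flagged for manual checking.

```python
import os
import re

SPDX_RE = re.compile(r'SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)')


def normalize(text):
    """Lowercase and collapse everything except letters and digits to spaces,
    so comment characters and line wraps do not break phrase matching."""
    return re.sub(r'[^a-z0-9]+', ' ', text.lower())


def detect_from_header(text):
    """Guess an SPDX ID from a file's leading text, or return None."""
    m = SPDX_RE.search(text)
    if m:
        return m.group(1)
    flat = normalize(text)
    if 'gnu general public license' not in flat:
        return None  # LGPL, AGPL, CC0 etc. are not handled in this sketch
    m = re.search(r'version ([23]) of the license', flat)
    if not m:
        return None
    suffix = 'or-later' if 'any later version' in flat else 'only'
    return 'GPL-%s.0-%s' % (m.group(1), suffix)


def detect_license(path, repo_root):
    """Return (spdx_id_or_None, where_it_came_from) for one file."""
    with open(path, encoding='utf-8', errors='replace') as f:
        found = detect_from_header(f.read(4096))
    if found:
        return found, 'file'
    copying = os.path.join(repo_root, 'COPYING')
    if os.path.exists(copying):
        with open(copying, encoding='utf-8', errors='replace') as f:
            flat = normalize(f.read(4096))
        m = re.search(r'gnu general public license version ([23])\b', flat)
        if m:
            # Assumption: default to -or-later; the license text itself
            # does not say which variant the authors chose.
            return 'GPL-%s.0-or-later' % m.group(1), 'COPYING (check manually)'
    return None, 'unknown'
```

Calling `detect_license('some/file.dix', '.')` from a repo root would return the guessed ID together with where it came from, so anything not found in the file itself can be reviewed by hand.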

@mr-martian
Contributor

I knew about -fin and was assuming that whatever this spit out would require manual checking, but would at least be mostly correct for language repos.

@prathamaror

Can I start working on this issue?

@TinoDidriksen
Member Author

> Can I start working on this issue?

Anyone can work on any issue - just do the work and open a PR in the relevant repo. But this issue will in the end touch nearly all our 592 repositories. You can certainly make a helper script that authors can run in repositories, though.
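
As a starting point for such a helper, a script could first report which files in a checkout still lack an SPDX line, so authors can see the scope before anything is modified. The sketch below is an assumption about how that might look: the `missing_spdx` name, the skipped directory names, and the choice to look only at the first five lines are all illustrative.

```python
import os

# Assumption: directories that should never be scanned
SKIP_DIRS = {'.git', '.svn', 'autom4te.cache'}


def missing_spdx(root):
    """Yield paths (relative to root) of text files that have no
    SPDX-License-Identifier within their first five lines."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding='utf-8') as f:
                    head = ''.join(f.readline() for _ in range(5))
            except (UnicodeDecodeError, OSError):
                continue  # binary or unreadable file, skip it
            if 'SPDX-License-Identifier:' not in head:
                yield os.path.relpath(path, root)
```

Running `for path in missing_spdx('.'): print(path)` in a repository would list the files that still need attention.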
