{"url":"https://api.github.com/repos/simonw/datasette/issues/2143","repository_url":"https://api.github.com/repos/simonw/datasette","labels_url":"https://api.github.com/repos/simonw/datasette/issues/2143/labels{/name}","comments_url":"https://api.github.com/repos/simonw/datasette/issues/2143/comments","events_url":"https://api.github.com/repos/simonw/datasette/issues/2143/events","html_url":"https://github.com/simonw/datasette/issues/2143","id":1855885427,"node_id":"I_kwDOBm6k_c5unpBz","number":2143,"title":"De-tangling Metadata before Datasette 1.0","user":{"login":"asg017","id":15178711,"node_id":"MDQ6VXNlcjE1MTc4NzEx","avatar_url":"https://avatars.githubusercontent.com/u/15178711?v=4","gravatar_id":"","url":"https://api.github.com/users/asg017","html_url":"https://github.com/asg017","followers_url":"https://api.github.com/users/asg017/followers","following_url":"https://api.github.com/users/asg017/following{/other_user}","gists_url":"https://api.github.com/users/asg017/gists{/gist_id}","starred_url":"https://api.github.com/users/asg017/starred{/owner}{/repo}","subscriptions_url":"https://api.github.com/users/asg017/subscriptions","organizations_url":"https://api.github.com/users/asg017/orgs","repos_url":"https://api.github.com/users/asg017/repos","events_url":"https://api.github.com/users/asg017/events{/privacy}","received_events_url":"https://api.github.com/users/asg017/received_events","type":"User","user_view_type":"public","site_admin":false},"labels":[{"id":2175284252,"node_id":"MDU6TGFiZWwyMTc1Mjg0MjUy","url":"https://api.github.com/repos/simonw/datasette/labels/design","name":"design","color":"083993","default":false,"description":""}],"state":"open","locked":false,"assignees":[],"milestone":null,"comments":24,"created_at":"2023-08-18T00:51:50Z","updated_at":"2023-08-24T18:28:27Z","closed_at":null,"assignee":null,"author_association":"COLLABORATOR","active_lock_reason":null,"sub_issues_summary":{"total":0,"completed":0,"percent_completed":0},"issue_dependencies_summary":{"blocked_by":0,"total_blocked_by":0,"blocking":0,"total_blocking":0},"body":"Metadata in Datasette is a really powerful feature, but is a bit difficult to work with. It was initially a way to add \"metadata\" about your \"data\" in Datasette instances, like descriptions for databases/tables/columns, titles, source URLs, licenses, etc. But it later became the go-to spot for other Datasette features that have nothing to do with metadata, like permissions/plugins/canned queries. \r\n\r\nSpecifically, I've found the following problems when working with Datasette metadata:\r\n\r\n1. Metadata cannot be updated without re-starting the entire Datasette instance.\r\n2. The `metadata.json`/`metadata.yaml` has become a kitchen sink of unrelated (imo) features like plugin config, authentication config, canned queries\r\n3. The Python APIs for defining extra metadata are a bit awkward (the `datasette.metadata()` class, `get_metadata()` hook, etc.)\r\n\r\n## Possible solutions\r\n\r\nHere's a few ideas of Datasette core changes we can make to address these problems. \r\n\r\n### Re-vamp the Datasette Python metadata APIs\r\n\r\nThe Datasette object has a single `datasette.metadata()` method that's a bit difficult to work with. There's also no Python API for inserted new metadata, so plugins have to rely on the `get_metadata()` hook.\r\n\r\nThe `get_metadata()` hook can also be improved - it doesn't work with async functions yet, so you're quite limited to what you can do.\r\n\r\n(I'm a bit fuzzy on what to actually do here, but I imagine it'll be very small breaking changes to a few Python methods)\r\n\r\n### Add an optional `datasette_metadata` table\r\n\r\nDatasette should detect and use metadata stored in a new special table called `datasette_metadata`. This would be a regular table that a user can edit on their own, and would serve as a \"live updating\" source of metadata, than can be changed while the Datasette instance is running.\r\n\r\nNot too sure what the schema would look like, but I'd imagine:\r\n\r\n```sql\r\nCREATE TABLE datasette_metadata(\r\n  level text,\r\n  target any,\r\n  key text,\r\n  value any,\r\n  primary key (level, target)\r\n)\r\n```\r\n\r\nEvery row in this table would map to a single metadata \"entry\".\r\n\r\n- `level` would be one of \"datasette\", \"database\", \"table\", \"column\", which is the \"level\" the entry describes. For example, `level=\"table\"` means it is metadata about a specific table, `level=\"database\"` for a specific database, or `level=\"datasette\"` for the entire Datasette instance.\r\n- `target` would \"point\" to the specific object the entry metadata is about, and would depend on what `level` is specific. \r\n  - `level=\"database\"`: `target` would be the string name of the database that the metadata entry is about. ex `\"fixtures\"`\r\n  - `level=\"table\"`: `target` would be a JSON array of two strings. The first element would be the database name, and the second would be the table name. ex `[\"fixtures\", \"students\"]`\r\n  - `level=\"column\"`: `target` would be a JSON array of 3 strings: The database name, table name, and column name. Ex `[\"fixtures\", \"students\", \"student_id\"`]\r\n- `key` would be the type of metadata entry the row has, similar to the current \"keys\" that exist in `metadata.json`. Ex `\"about_url\"`, `\"source\"`, `\"description\"`, etc\r\n- `value` would be the text value of be metadata entry. The literal text value of a description, about_url, column_label, etc\r\n\r\nA quick sample:\r\n\r\nlevel | target | key | value\r\n-- | -- | -- | --\r\ndatasette | NULL | title | my datasette title...\r\ndb | fixtures | source | <description of my database source>\r\ntable | [\"fixtures\", \"students\"] | label_column | student_name\r\ncolumn | [\"fixtures\", \"students\", \"birthdate\"] | description | <description of the fixtures.students.birthdate column>\r\n\r\nThis `datasette_metadata` would be configured with other tools, and hopefully not manually by end users. Datasette Core could also offer a UI for editing entries in `datasette_metadata`, to update descriptions/columns on the fly.\r\n\r\n### Re-vamp `metadata.json` and move non-metadata config to another place\r\n\r\nThe motivation behind this is that it's awkward that `metadata.json` contains config about things that are not strictly metadata, including:\r\n\r\n- Plugin configuration\r\n- [Authentication/permissions](https://docs.datasette.io/en/latest/authentication.html#access-permissions-in-metadata) (ex the `allow` key on datasettes/databases/tables\r\n- Canned queries. might be controversial, but in my mind, canned queries are application-specific code and configuration, and don't describe the data that exists in SQLite databases. \r\n\r\nI think we should move these outside of `metadata.json` and into a different file. The `datasette.json` idea in  #2093 may be a good solution here: plugin/permissions/canned queries can be defined in `datasette.json`, while `metadata.json`/`datasette_metadata` will strictly be about documenting databases/tables/columns. \r\n","closed_by":null,"reactions":{"url":"https://api.github.com/repos/simonw/datasette/issues/2143/reactions","total_count":0,"+1":0,"-1":0,"laugh":0,"hooray":0,"confused":0,"heart":0,"rocket":0,"eyes":0},"timeline_url":"https://api.github.com/repos/simonw/datasette/issues/2143/timeline","performed_via_github_app":null,"state_reason":null,"pinned_comment":null}