Added kdocs for DataSchema and DataRowSchema #1775

Open
koperagen wants to merge 1 commit into master from dataschema-documentation

Conversation

@koperagen
Collaborator

I tried to fill in the gap and, at the same time, provide a new perspective: my Nth attempt to explain DataSchema :))

@koperagen self-assigned this on Mar 27, 2026
@koperagen added the KDocs (Improvements or additions to KDocs) label on Mar 27, 2026
@koperagen force-pushed the dataschema-documentation branch from 3d9c104 to 7ab74cf on March 27, 2026 16:46
* Given the initial schema of the data you read, the compiler plugin will provide a typed result for most operations.
*
* Example:
* ```
Collaborator

Please annotate code samples with the right language to get syntax highlighting, anywhere you use Markdown ;P
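A minimal sketch of what a language-tagged fence inside a KDoc could look like. `DataSchemaStandIn` is a hypothetical stand-in annotation so the snippet compiles without the DataFrame dependency; it is not the library's `@DataSchema`.

```kotlin
// Hypothetical stand-in, declared only so this sketch compiles on its own.
annotation class DataSchemaStandIn

/**
 * Example:
 * ```kotlin
 * @DataSchemaStandIn
 * data class Person(val name: String, val age: Int)
 * ```
 */
@DataSchemaStandIn
data class Person(val name: String, val age: Int)

fun main() {
    // Uses the declaration documented above.
    println(Person("Alice", 30))
}
```

The only change that matters here is the `kotlin` tag on the opening fence inside the KDoc; the rest is scaffolding.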

* @DataSchema
* data class Group(
* val id: String,
* val participants: List<Person>
Collaborator

inconsistent trailing commas

* )
*
* fun main() {
* val url = "https://raw.githubusercontent.com/Kotlin/dataframe/refs/heads/master/data/participants.json"
Collaborator

Inconsistent spacing. I'd recommend using the KoDEx @sample tag :) Similar to Korro, it lets us include code between // SampleStart and // SampleEnd comments, which makes sure the code compiles and is formatted correctly.

import org.jetbrains.kotlinx.dataframe.api.cast
import org.jetbrains.kotlinx.dataframe.api.convertTo

/**
Collaborator

I'd start with a more general introduction before jumping into exactly what it does.
So: "This annotation marks an interface or data class as a 'data schema'" (link to https://kotlin.github.io/dataframe/schemas.html). Then continue with "It's used to generate extension properties, etc." That gives a bit more context to this key DataFrame component :)


/**
* Annotation to generate extension properties API for a given declaration, according to its properties.
* Annotated declaration should be non-local and non-private interface or a class.
Collaborator

*Annotated declaration should be a non-local and non-private interface or class.

/**
* Annotation to generate extension properties API for a given declaration, according to its properties.
* Annotated declaration should be non-local and non-private interface or a class.
* The aim here is to provide convenient syntax for working with a dataframe instance right after reading from it CSV, JSON, Databases, Arrow, etc.
Collaborator

*a convenient syntax

Collaborator

*databases

* fun main() {
* val url = "https://raw.githubusercontent.com/Kotlin/dataframe/refs/heads/master/data/participants.json"
* val df = DataFrame.readJson(url).cast<Group>()
* val i: Int = df.id[0] // properties style access to columns and values
Collaborator

You can come up with better names than i and l, right? ;P I know it's not relevant to the example itself, but I feel we should still set a good example with expressive variable names.

* distinct { city } into "cities"
* }
*
* // now compiler plugin uses previous knowledge of `Group` combined with its understanding of aggregate operation
Collaborator

*the compiler plugin

* @see [org.jetbrains.kotlinx.dataframe.DataFrame.convertTo]
*/
@Target(AnnotationTarget.CLASS)
public annotation class DataSchema(val isOpen: Boolean = true)
Collaborator

We should probably explain what isOpen does

import org.jetbrains.kotlinx.dataframe.annotations.DataSchema

/**
* Marker interface that's automatically added to classes annotated with [DataSchema]
Collaborator

... why? (because it can help with .append() etc.) Also, we should specify how this is added (with the compiler plugin) and that it's added only to data classes, not to interfaces (right?) and why.

"Added" -> "Added as supertype"
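To make the "why" concrete, here is a rough plain-Kotlin sketch of how a marker supertype can act as a generic upper bound. `RowSchemaMarker` and `rowsOf` are hypothetical names for illustration only; in this sketch the supertype is written by hand, whereas the thread suggests the compiler plugin adds it for annotated classes.

```kotlin
// Hypothetical stand-in for a marker interface; it declares no members.
interface RowSchemaMarker

// The supertype is written out manually here, purely for illustration.
data class Person(val name: String, val age: Int) : RowSchemaMarker

// The upper bound restricts this function to types that carry the marker.
fun <T : RowSchemaMarker> rowsOf(vararg rows: T): List<T> = rows.toList()

fun main() {
    val people = rowsOf(Person("Alice", 30), Person("Bob", 25))
    println(people.size)
    // rowsOf("not a schema") would be rejected at compile time,
    // because String does not implement RowSchemaMarker.
}
```

The same bound shape appears in the `dataFrameOf` signature quoted later in this thread (`reified T : DataRowSchema`), which is presumably the kind of call the KDoc should mention.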

* data class Person(val name: String, val age: Int)
*
* fun main() {
* val df = dataFrameOf(Person("Alice", 30), Person("Bob", 25))
Collaborator

@Jolanrensen Mar 30, 2026

Single-space indent ;P Same story, try @sample :) You could even @sample a piece of code and exclude it from the sources in this file. For instance:

@ExcludeFromSources
private interface Sample {

    // to make it compile without the compiler plugin
    fun <T> dataFrameOf(vararg rows: T): DataFrame<T> = TODO()

    // SampleStart
    @DataSchema
    data class Person(val name: String, val age: Int)

    fun main() {
        val df: DataFrame<Person> = dataFrameOf(Person("Alice", 30), Person("Bob", 25))
    }
    // SampleEnd
}

/**
 * Example:
 * @sample [Sample]
 */
public inline fun <reified T : DataRowSchema> dataFrameOf(vararg rows: T): DataFrame<T> =
    rows.asIterable().toDataFrame()

Collaborator

append could also do with a small sample like this :)

@Jolanrensen added this to the 1.0.0-Beta5 milestone on Mar 30, 2026
@Jolanrensen
Collaborator

Useful! :) Overall I like the extra information, but I feel it lacks links to other resources. We have a plethora of information on the website that we should link to.
