Fix: Potential Data Leakage in Quantum Data Tutorial.#829

Closed
OkuyanBoga wants to merge 1 commit into tensorflow:master from OkuyanBoga:fix-data-leakage-in-tutorial

@OkuyanBoga

A solution to potential data leakage in #828.

Instead of concatenating the train and test sets, handle them separately when generating the stilted dataset:

In lines L745-752:

y_train_new = get_stilted_dataset(S_pqk, V_pqk, S_original, V_original)
y_test_new = get_stilted_dataset(S_pqk_test, V_pqk_test, S_test_original, V_test_original)

where the spectrum is calculated separately for the test set:

S_pqk_test, V_pqk_test = get_spectrum(
    tf.reshape(x_test_pqk, [-1, len(qubits) * 3]))

S_test_original, V_test_original = get_spectrum(
    tf.cast(x_test, tf.float32), gamma=0.005)

print('Eigenvectors of pqk kernel matrix for test:', V_pqk_test)
print('Eigenvectors of original kernel matrix for test:', V_test_original)
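
For readers without the notebook at hand, the difference between the two approaches can be sketched in plain NumPy. This is only an illustration under assumptions: `rbf_kernel` and `kernel_spectrum` are hypothetical stand-ins that use a standard RBF kernel, not the tutorial's own `get_spectrum` helper, and the random arrays are toy data rather than the tutorial's features.

```python
import numpy as np


def rbf_kernel(x, gamma=1.0):
    """Return K[i, j] = exp(-gamma * ||x[i] - x[j]||^2) (hypothetical helper)."""
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq_dists)


def kernel_spectrum(x, gamma=1.0):
    """Return absolute eigenvalues and eigenvectors of the kernel matrix of x."""
    eigvals, eigvecs = np.linalg.eigh(rbf_kernel(x, gamma))
    return np.abs(eigvals), eigvecs


rng = np.random.default_rng(0)
x_train = rng.normal(size=(8, 3))  # toy stand-in for training features
x_test = rng.normal(size=(4, 3))   # toy stand-in for test features

# Joint variant: a single spectrum over train and test together, so every
# eigenvector has entries for both training and test points.
s_joint, v_joint = kernel_spectrum(np.concatenate([x_train, x_test], axis=0))

# Split-wise variant: each spectrum is computed from its own split only.
s_train, v_train = kernel_spectrum(x_train)
s_test, v_test = kernel_spectrum(x_test)

print(v_joint.shape, v_train.shape, v_test.shape)
```

In the joint variant the eigenvectors span all 12 points, so anything derived from them for the training rows also depends on the test points. In the split-wise variant the training spectrum is unaffected by the test data.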

Closes #828.

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


@mhucka
Member

mhucka commented Feb 25, 2026

@OkuyanBoga Thank you for this contribution. Would you be able to resolve the (simple) conflicts that have arisen? I will then review the PR.

@mhucka mhucka self-assigned this Feb 25, 2026
@mhucka mhucka added the area/docs Involves documentation – problems, ideas, requests label Feb 25, 2026
@mhucka
Member

mhucka commented Mar 6, 2026

Closing due to age and nonresponse.

@mhucka mhucka closed this Mar 6, 2026
@OkuyanBoga
Author

Hi, sorry for the late response, but I think the issue I showed here breaks the whole tutorial. Without the leakage, the performance of the method drops significantly.

Any suggestions or comments?

Labels

area/docs Involves documentation – problems, ideas, requests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Possible data leakage in quantum/docs/tutorials/quantum_data.ipynb

2 participants