Using cross-validation technique for a CNN model?


I am working on a CNN model. As usual, I train it in batches over several epochs; once training and validation are finished, I use a test set to measure the model's performance and generate a confusion matrix. Now I want to use cross-validation to train my model. I can implement it, but I have some questions:

1. Why do most CNN models not use cross-validation?

2. If I use cross-validation, how can I generate a confusion matrix? Can I split the dataset into train/test, then do cross-validation on the training set as train/validation (i.e. use cross-validation for the train/validation part while keeping the usual train/test split), and finally use the test set the same way as before? Or how should it be done?

Tags: python, deep-learning

asked 6 hours ago by honar.cs

1 Answer

Question 1: Why do most CNN models not apply the cross-validation technique?

$k$-fold cross-validation is often used for simple models that have few parameters and few hyperparameters and that are easy to optimize. Typical examples are linear regression, logistic regression, small neural networks and support vector machines.

For a convolutional neural network with many parameters (e.g. more than one million), there are simply too many possible changes to the architecture. What you can do is run some experiments with the learning rate, batch size, dropout (amount and position) and batch normalization (position). Training a convolutional neural network on a huge dataset takes a long time, and $k$-fold cross-validation multiplies that cost: every hyperparameter configuration has to be trained $k$ times. A full hyperparameter optimization is therefore usually overkill. In papers, the authors often try to improve on the results of other research papers; the goal is not to get better results by tuning the chosen hyperparameters, but to come up with new ideas that solve the given task with better accuracy or less computational effort.
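To make the cost argument concrete, here is a rough back-of-the-envelope sketch. The search grid, the value of `k`, and the six hours per training run are all made-up assumptions for illustration, not figures from the question or the answer.

```python
from itertools import product


def count_training_runs(grid, k):
    """Number of model fits needed for a grid search with k-fold cross-validation."""
    n_configs = len(list(product(*grid.values())))
    return n_configs * k


# Hypothetical search space over some of the hyperparameters mentioned above.
grid = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [32, 64, 128],
    "dropout": [0.0, 0.25, 0.5],
}
k = 3
hours_per_run = 6.0  # assumed cost of one full training run

runs = count_training_runs(grid, k)
total_hours = runs * hours_per_run
print(f"{runs} training runs, about {total_hours} hours")
```

Even this small grid of 27 configurations needs 81 training runs with 3 folds, which is why exhaustive search is rarely done for large networks.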




Question 2: If I use cross-validation, how can I generate a confusion matrix? Can I split the dataset into train/test, then do cross-validation on the train set as train/validation (i.e. doing cross-validation as train/validation except for the usual train/test), and at last use the test set the same way? Or how?

In order to do $k$-fold cross-validation you first need to split your initial dataset into two parts: one for the hyperparameter optimization and one for the final validation (the held-out test set). Then we take the dataset for the hyperparameter optimization and split it into $k$ (as far as possible) equally sized subsets $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_k$. For the sake of clarity let us set $k=3$. Then, for each hyperparameter combination that we want to test, we use $\mathcal{D}_1$ and $\mathcal{D}_2$ to fit our model and $\mathcal{D}_3$ to validate it. Then we do the same with $\mathcal{D}_2$ and $\mathcal{D}_3$, using $\mathcal{D}_1$ for validation, and with $\mathcal{D}_1$ and $\mathcal{D}_3$, using $\mathcal{D}_2$ for validation. This gives $3$ confusion matrices for every hyperparameter configuration.

To derive a metric from these three results, we take the mean of the confusion matrices. Then we scan through all averaged confusion matrices to select the best hyperparameter configuration (you have to define which parts of the confusion matrix are important for your problem). Finally, we pick the 'best' hyperparameters and calculate the prediction performance on the final validation set. These performance metrics are the ones that you report.
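The procedure described above could be sketched roughly as follows with scikit-learn. This is only an illustration under several assumptions: a logistic regression on the small `load_digits` dataset stands in for the CNN so that the example runs quickly, the candidate values for `C` are a hypothetical grid, overall accuracy is used as the selection criterion, and the model is refit on the whole optimization set before the final evaluation (a step the answer does not spell out).

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold, train_test_split

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values to [0, 1]
labels = np.unique(y)

# Split once into an optimization set and a final validation (test) set.
X_opt, X_final, y_opt, y_final = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def make_model(C):
    # Stand-in for building a CNN with the given hyperparameters (assumption).
    return LogisticRegression(C=C, max_iter=2000)


def mean_confusion_matrix(C, k=3):
    """Average the k per-fold confusion matrices for one configuration."""
    folds = KFold(n_splits=k, shuffle=True, random_state=0)
    matrices = []
    for train_idx, val_idx in folds.split(X_opt):
        model = make_model(C).fit(X_opt[train_idx], y_opt[train_idx])
        pred = model.predict(X_opt[val_idx])
        matrices.append(confusion_matrix(y_opt[val_idx], pred, labels=labels))
    return np.mean(matrices, axis=0)


def score(cm):
    # Selection criterion (assumption): overall accuracy = diagonal / total.
    return np.trace(cm) / cm.sum()


candidates = [0.01, 1.0, 100.0]  # hypothetical hyperparameter grid
mean_cms = {C: mean_confusion_matrix(C) for C in candidates}
best_C = max(candidates, key=lambda C: score(mean_cms[C]))

# Refit on the full optimization set (assumption) and evaluate exactly once
# on the held-out data; this is the confusion matrix to report.
final_model = make_model(best_C).fit(X_opt, y_opt)
final_cm = confusion_matrix(y_final, final_model.predict(X_final), labels=labels)
print("best C:", best_C)
print(final_cm)
```

For a real CNN, `make_model` would build and train the network instead, and `score` can be replaced by whatever part of the confusion matrix matters for the problem, for example the recall of one particular class.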







answered 5 hours ago by MachineLearner